ABSTRACT


The  Selective  Reenlistment  Bonus  (SRB)  program  is  designed  to  offer  an 
attractive  reenlistment  incentive  to  improve  manning  in  critical  skills.  To  efficiently 
manage  the  SRB  program,  a  requirement  exists  to  maintain  MOS  level  estimating 
factors  for  use  in  projecting  retention  rate  improvement  as  a  function  of  SRB  award 
level.  This  thesis  formulates  and  solves  a  mathematical  model  which  explains  the 
variation  in  zone  A  retention  rates  as  a  function  of  SRB  award  level  and  other  factors 
believed  significant  in  the  reenlistment  decision. 

To  allow  for  comparison  of  the  estimating  factors  associated  with  the  SRB 
variable  across  MOS,  an  overall  projection  model  was  developed.  Stepwise  multiple 
linear  regression  analysis  techniques  were  used  on  a  subset  of  the  enlisted  MOS 
inventory in the model development phase of this analysis. The proposed overall
model was then fitted to a second subset of MOS to validate the assumptions and
effectiveness of the proposed linear model.
I. INTRODUCTION


The  Commander,  United  States  Army  Military  Personnel  Center 
(MILPERCEN),  is  responsible  for  developing  and  issuing  policies,  standards  and 
procedures in the administration of the Selective Reenlistment Bonus (SRB) program.
The SRB program is designed to offer an attractive reenlistment incentive to improve
manning  in  the  most  critical  skills.  A  primary  consideration  in  the  management  of  the 
SRB  program  is  the  historic  effectiveness  of  an  SRB  in  improving  retention  in  a 
particular  skill.  In  this  study,  the  problem  of  measuring  the  historic  effectiveness  of  the 
SRB  program  is  modelled  and  solved  using  stepwise  and  ordinary  least  squares  multiple 
linear  regression  analysis. 

A.  PROBLEM  STATEMENT 

The Commander, MILPERCEN must recommend to the Deputy Chief of Staff
for Personnel (DCSPER) those Military Occupational Specialties (MOS) which should
be included in the SRB program. The criteria used to determine which MOS should be
included in the SRB program are outlined in the form of several guidelines (specifically,
Title 37 United States Code, section 308, Department of Defense (DOD) Directive
1304.21 and DOD Directive 1304.22). Some criteria, such as replacement training
costs, are easily quantified. Other criteria, such as the relative unattractiveness of each
MOS compared to other military and civilian skills, are much more subjective.

One  criterion  upon  which  the  decision  to  include  a  particular  MOS  in  the  SRB 
program  is  based  is  the  projected  improvement  in  retention  in  response  to  the  bonus 
awarded.  There  must  be  a  reasonable  prospect  of  enough  improvement  in  retention  to 
justify  the  projected  cost  of  the  bonus.  Therefore,  a  requirement  exists  to  maintain 
estimating  factors  for  use  in  projecting  retention  rate  improvement  as  a  function  of 
SRB  award  level.  DOD  directs  that  these  factors  be  developed  from  actual  experience 
under  the  SRB  program. 

The improvement factors currently available are outdated and were developed
without  consideration  to  certain  variables  believed  critical  to  an  accurate  projection  of 
retention  at  the  MOS  level. 


B.  BACKGROUND 

In September 1981, the DCSPER requested that the Commander, United States
Army Concepts Analysis Agency (CAA) establish a study group to develop an
improved methodology for allocation of SRB funds. An intermediate goal of the study
group was to quantify the effect of SRB on retention; that is, develop a set of
historically based improvement factors. These factors were to replace similar
improvement factors published by the Rand Corporation in September 1977 [Ref. 1].
The DCSPER suggested that the Rand factors were no longer valid, in light of more
recent trends in retention, pay and civilian perception of military service.

In August 1982, the study was completed by CAA. Included in their final report
[Ref. 2] were a set of MOS and reenlistment zone specific SRB effectiveness factors.
These factors were said to represent the net change in retention rate for a given MOS
brought on by a change in the SRB authorized that MOS. The factors were actually
the estimated regression coefficients of the carrier variable SRB in the multiple linear
regression model used to explain retention rate behavior for all MOS during the
previous five years. The specific model follows:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \beta_4 X_3^2 + \beta_5 X_3^3 + \alpha_1 Z_1 + \alpha_2 Z_2 + \varepsilon \tag{1.1}$$

where:

$Y$ = retention rate
$X_1$ = SRB multiplier
$X_2$ = year
$X_3$ = calendar quarter
$Z_1$ = unemployment rate
$Z_2$ = Consumer Price Index
$\varepsilon$ = error component with assumed distribution $N(0, \sigma^2)$.

While the study group cautioned against using the retention improvement factors
(estimated regression coefficient $b_1$) for longer than two years, no provisions were
made for the periodic re-estimation of those coefficients. Hence, the current set of
coefficients is a function of data which are at least five years old. Additionally, while
diagnostics from the CAA model support a reasonably good fit to the data available,
no attempt was made by the CAA analysts to account for the effects of factors such as
population demographics and promotion opportunity.

The Deputy Chief of Staff for Plans (DCSPLANS), MILPERCEN submitted this
problem, with the below stated objectives, to the Naval Postgraduate School, pursuant
to a special thesis study / management program. Under this program, a participating
Army student works with MILPERCEN to resolve a current problem and receives a
follow-on assignment to the Personnel Center upon graduation. All research costs and
other costs associated with thesis preparation are borne by MILPERCEN.

C.  STUDY  OBJECTIVE 

The objective of this study is to formulate a mathematical model which explains
the variation in zone A enlisted retention rates over time at the MOS level of detail.
Variables representing promotion opportunity to grades E5 and E6 and a variable
representing SRB award level are to be considered as candidate explanatory variables.

D. MODEL AND SOLUTION APPROACH

The  mathematical  formulation  proposed  in  this  study  is  an  ordinary  least  squares 
multiple  linear  regression  model  with  higher  order  terms.  It  is  our  intention  to 
carefully  select  our  dependent  and  independent  variables  so  that  the  model  can  be  used 
in  a  predictive  manner:  given  a  set  of  outcomes  on  the  explanatory  variables,  we  wish 
to predict an outcome on our selected response variable with a measurable degree of
precision. 

Our  objective  is  to  build  a  model  which  can  predict  zone  A  retention  at  the  MOS 
level. It is likely, therefore, that if each MOS subpopulation were studied
independently, the carrier variables included in the final model (selected by some
system of rules) would not be identical for each MOS. This situation, for our
purposes,  is  not  acceptable. 

The intentions of our user dictate that we select a best model and apply it for all
MOS. As has already been mentioned, the SRB managers have used the estimated
coefficient of the carrier variable SRB (we refer to this estimate henceforth simply as
$b_1$) to compare the effects on retention of varying the SRB level across several, or even
all, MOS. Mosteller and Tukey [Ref. 3: pp. 315-331] warn that the coefficient of a
carrier is very dependent on its costock, that is, on the other carriers included in the
model with it. In our case, we will attempt to construct a
model so that the carrier variable representing SRB is unrelated to any variable in the
costock. The interpretation of the estimated coefficient as the effect of SRB level
changing while costock variables keep their same values is then reasonable at the MOS
level. For comparisons to be made across different MOS, however, we must use the
same model for all MOS. While such a solution approach has the disadvantage of
suboptimizing our prediction capability at the MOS level, it has the large advantage of
permitting a reasonably valid comparison of the relative effectiveness of SRB across a
group of MOS.

From the perspective of the user, the overall model approach offers two other
distinct advantages. First, it offers simplicity. The managers who will be responsible
to implement and maintain this model are not operations analysts and will resist
integrating a complicated model / procedure into an already busy schedule. Second, an
overall model offers credibility. It would be very difficult to explain to non-analysts
why a particular carrier, say Consumer Price Index, is pertinent to the reenlistment
decision of a soldier in one MOS, but not in another.

An  outline  of  the  steps  included  in  our  modelling  and  solution  approach  follows. 
It  is  consistent  with  a  methodology  recommended  by  Draper  and  Smith  [Ref.  4:  p. 
414].

1 Define the problem. Select a response variable. Suggest relevant carrier
variables.

2 Can we obtain a complete set of observations on all specified carrier variables
and the selected response variable? If not, return to step (1). Otherwise,
continue.

3 Establish model goals. Consider the minimum / maximum number of included
carrier variables desired and determine the desired level of statistical significance
for the estimated coefficients of each.

4 Construct a correlation matrix. Guard against including carriers which are highly
correlated.

5 Conduct independent multiple linear stepwise regression analysis for each MOS
included in the study. Examine the residuals for support of the model
assumptions. Are the models adequate? If not, return to step (1). Otherwise,
continue.

6 Propose an overall linear regression model.

7 Conduct ordinary least-squares multiple linear regression analysis for each MOS
included in the study. Examine the residuals for support of the model
assumptions. Is the model adequate? If not, return to step (6). Otherwise,
continue.

8 Are the coefficients reasonable? Is the model plausible? Is the equation usable?
If not, return to step (1) or (6) as appropriate.
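
The correlation screen in step 4 can be illustrated with a short sketch. This is not the procedure used in the study (which relied on SAS); it is a minimal Python illustration, and the function name, the carrier names, the data values and the 0.8 cutoff are all assumptions chosen for the example.

```python
import numpy as np


def highly_correlated_pairs(data, names, threshold=0.8):
    """List pairs of candidate carriers whose |correlation| meets the threshold.

    data  : (n observations x k carriers) array
    names : k carrier names (hypothetical labels)
    The 0.8 default is an illustrative cutoff, not a value from the study.
    """
    corr = np.corrcoef(data, rowvar=False)   # k x k correlation matrix
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(corr[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Made-up quarterly observations on three candidate carriers.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
data = np.column_stack([a, 2.0 * a + 1.0, [2.0, -1.0, 0.0, 3.0, -2.0]])
flagged = highly_correlated_pairs(data, ["cpi", "pay_index", "unemployment"])
print(flagged)
```

Any pair reported by such a screen would be a candidate for dropping one of the two carriers before the stepwise analysis of step 5.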
E. INITIAL ASSUMPTIONS

Some  further  assumptions  should  be  addressed.  We  assume  that  an  individual's 
propensity  to  reenlist  is  a  function  of  many  variables,  both  personal  and 
environmental.  We  assume  that  it  is  possible  to  formulate  a  mathematical  model 
which estimates the propensity of individuals to reenlist at the MOS level. While this
assumption is driven by a user requirement for an MOS level model, it is not an
unreasonable one. The assumption implies that individuals in the same MOS behave
similarly with respect to the factors which affect their reenlistment decision. It also
allows that soldiers in different MOS may have different perceptions of the
environment in which they make their reenlistment decision. These implications can be
justified with respect to the Enlisted Personnel Management System (EPMS). The
duties and training required of each MOS are associated with different civilian skills.
Also, the general qualifications and skills of the MOS subpopulations are sorted at
enlistment. For example, the mean Armed Forces Qualification Test (AFQT) score for
one MOS is not the same, nor is it intended to be the same, as any other MOS. EPMS
establishes the MOS as the basic unit of personnel inventory management. It is not
only the required level, but also the logical level at which to conduct this study.

We must also assume for the purposes of this study that EPMS remains
relatively stable. Further, we assume that the socio-economic environment in which
the soldier makes a reenlistment decision is stable (within the norms established in the
historic scope of this study).

F. THESIS OUTLINE

This thesis formulates and develops a mathematical model which explains the
variation in zone A retention at the MOS level. In Chapter II, a brief overview of the
SRB program is presented. In Chapter III, the assumptions and analysis leading to the
development of an overall model are explained. In Chapter IV, the results of fitting
the proposed overall model to the available data are presented and discussed. Finally,
Chapter V includes the conclusions and recommendations of this study.

G. PROGRAMMING LANGUAGES AND STATISTICAL PACKAGES

All  programming  associated  with  data  collection  and  manipulation  was 
completed  using  FORTRAN  77  code.  All  data  analysis  and  most  graphics  were 
completed using the SAS, version V, statistical package. These choices were made with
respect  to  the  current  capabilities  and  assets  of  the  Military  Personnel  Center. 


II. THE SELECTIVE REENLISTMENT BONUS PROGRAM


This Chapter presents a brief overview of the Selective Reenlistment Bonus
(SRB) program. Criteria for including MOS in the program are outlined, as are the
eligibility requirements and payment procedures. Finally, the budget history of the
program is graphically summarized.

A.  THE  OBJECTIVE 

The Selective Reenlistment Bonus program is designed to offer an attractive
reenlistment  incentive  to  improve  manning  in  critical  military  specialties. 

B.  CRITERIA  FOR  INCLUDING  MOS  IN  THE  SRB  PROGRAM 

As has been previously noted, there are many criteria considered before
including an MOS in, or excluding it from, the SRB program. Among these factors are:

1  a  comparison  of  career  manning  requirements  with  projected  inventory, 

2  the  cost  of  formal  school  training  for  replacement  personnel, 

3  the  expected  increase  in  retention  as  a  result  of  inclusion  in  the  SRB  program, 

4 the priority of MOS in terms of its essentiality to the Army mission,

5  the  inherent  unattractiveness  of  the  MOS  with  respect  to  other  military  and 
civilian  occupations. 

C.  ZONES  OF  ELIGIBILITY 

There  are  three  zones  of  individual  SRB  eligibility.  They  are: 

1 zone A, which applies to those service members who have completed at least 21
months of continuous active duty but not more than 6 years of active duty on
the day of reenlistment.

2 zone B, which applies to those service members who have completed at least 6
but no more than 10 years of active duty on the day of reenlistment.

3 zone C, which applies to those service members who have completed at least 10
but no more than 14 years of active duty on the day of reenlistment.

D.  THE  AMOUNT  OF  BONUS  AND  METHOD  OF  PAYMENT 
1.  Amount  of  Bonus 

The reenlistment bonus to which a service member is entitled upon
reenlistment is computed as follows:


SRB = (monthly base pay) x (years of additional obligated service) x (SRB level)   (2.1)

where the SRB level (multiplier) can assume values of 0, 1, 2, 3, 4, or 5. No more than one
SRB is authorized per soldier per zone. No SRB can exceed $20,000.00.

2.  Method  of  Payment 

Upon qualification for award of an SRB, a service member receives 50% of
the authorized SRB on the day of reenlistment, and the balance in equal annual
installments on the anniversary of the reenlistment during the reenlistment contract
period.
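
As a worked illustration of equation (2.1) and the payment rule, the following Python sketch computes a bonus and its payment schedule. The pay figure, the function names and the number of anniversary installments are hypothetical; in particular, the text does not say how the number of installments follows from the contract length, so it is passed in as an explicit assumption.

```python
def srb_amount(monthly_base_pay, additional_years, srb_level, cap=20_000.00):
    """Total SRB per equation (2.1), limited to the stated $20,000 ceiling."""
    if srb_level not in (0, 1, 2, 3, 4, 5):
        raise ValueError("SRB level must be an integer from 0 to 5")
    return min(monthly_base_pay * additional_years * srb_level, cap)


def payment_schedule(total, n_installments):
    """50% on the day of reenlistment, the balance in equal annual installments.

    n_installments is supplied by the caller (an assumption of this sketch).
    """
    if n_installments < 1:
        raise ValueError("at least one anniversary installment is required")
    initial = 0.5 * total
    return [initial] + [(total - initial) / n_installments] * n_installments


# Hypothetical soldier: $1,000.00 monthly base pay, 4 additional years, SRB level 2.
total = srb_amount(monthly_base_pay=1000.00, additional_years=4, srb_level=2)
print(total)                       # 8000.0
print(payment_schedule(total, 3))  # initial payment followed by 3 equal installments
```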

E. INDIVIDUAL ELIGIBILITY CRITERIA FOR ENLISTED SERVICE
MEMBERS

The individual eligibility criteria for service members are as prescribed in Army
Regulation (AR) 600-200 and AR 601-280.

F.  PAYMENT  EXPERIENCE 

As is indicated above, the amount of the SRB award to which an individual is
entitled is a function of three factors: SRB award level, individual monthly base pay,
and years of additional obligated service incurred as a result of the contract. The two
following graphics are included to provide the reader with a feel for the scope of the
problem. At Figure 2.1, the horizontal axis lists fiscal years while the vertical axis is
scaled to measure the total number of zone A SRB takers for each year. At Figure 2.2,
the horizontal axis again represents fiscal years, but the vertical axis represents the
total zone A SRB expenditures for each year. We note that both bonus takers and
expenditures were at a low point in FY83. We note also that while the total number of
zone A bonus takers has increased over the last 2 years, the total expenditures have
not. The underlying cause of this trend is that, in general, reenlistment bonuses are
available to more eligible soldiers, but at a lower level.
Figure 2.1 Zone A SRB Takers, FY81-FY85
III. MODEL FORMULATION


In this Chapter, the assumptions and analysis leading to the development of an
overall model are explained. First, the basic multiple linear regression model is
proposed in matrix notation. Then a response variable and a set of candidate carrier
variables are suggested. A sampling period is defined for use in estimating parameters
associated with the proposed variables. The problems encountered in data collection
and data preparation are discussed. The results of independent stepwise regression
analysis on each of the included MOS are explained. Finally, an overall multiple linear
regression model is proposed.

A.  PROPOSED  LINEAR  MODEL  IN  MATRIX  FORM 

In this thesis, we assume that there exists a relationship between the propensity
of a soldier to reenlist and that soldier's perception of the environment. A reliable
method of analysis to examine the nature of the relationship between our proposed
response variable (some measure of retention rate) and our candidate carrier variables
(which will attempt to account for changes in the makeup or environment of the
reenlistment decision-maker) is the method of least squares, or regression analysis.
Using this method of analysis, we will attempt to fit the following multiple linear
regression model to the data we collect for each MOS:


$$Y = X\beta + \varepsilon \tag{3.1}$$

where:

$Y$ is an (n x 1) vector of observations on the selected response variable
$X$ is an (n x p) matrix of observations on the selected carrier variables
$\beta$ is a (p x 1) vector of parameters to be estimated
$\varepsilon$ is an (n x 1) vector of errors assumed to have the distribution $N(0, \sigma^2 I)$

It is shown [Ref. 4: pp. 86-87] that if $X'X$ is non-singular, the least squares
estimate of $\beta$, call it $b$, can be written as:

$$b = (X'X)^{-1}X'Y \tag{3.2}$$

with variance-covariance matrix $(X'X)^{-1}\sigma^2$. Thus, the variance associated with
estimating any particular coefficient is given by:

$$V(b_i) = c_{ii}\,\sigma^2 \tag{3.3}$$

where $c_{ii}$ is the diagonal element in $(X'X)^{-1}$ corresponding to the ith variable. Further, a
prediction of $Y$ at $X_0$ is given by:

$$\hat{Y}_0 = b'X_0$$

with variance given by:

$$V(\hat{Y}_0) = X_0'(X'X)^{-1}X_0\,\sigma^2$$
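
The matrix expressions above translate directly into a few lines of code. The sketch below is an illustration in Python with NumPy, not the SAS procedure used in the study; the data are invented, and the residual mean square s2 is used in place of the unknown sigma^2, which is a standard substitution that the text above does not spell out.

```python
import numpy as np


def ols_fit(X, y):
    """Least squares fit of Y = X beta + eps; returns b, s2 and (X'X)^-1."""
    n, p = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)        # raises LinAlgError if X'X is singular
    b = xtx_inv @ X.T @ y                   # equation (3.2)
    resid = y - X @ b
    s2 = float(resid @ resid) / (n - p)     # residual mean square, estimates sigma^2
    return b, s2, xtx_inv


def prediction(x0, b, s2, xtx_inv):
    """Predicted response at x0 and the estimated variance of that prediction."""
    return float(x0 @ b), float(x0 @ xtx_inv @ x0) * s2


# Hypothetical data: an intercept column plus a single carrier (SRB level).
srb = np.array([0, 1, 1, 2, 3, 3, 4, 5], dtype=float)
X = np.column_stack([np.ones_like(srb), srb])
noise = np.array([0.01, -0.02, 0.02, 0.00, -0.01, 0.01, -0.02, 0.01])
y = 0.30 + 0.05 * srb + noise

b, s2, xtx_inv = ols_fit(X, y)
var_b = np.diag(xtx_inv) * s2               # equation (3.3) for each coefficient
y0, var_y0 = prediction(np.array([1.0, 2.0]), b, s2, xtx_inv)
print(b, var_b, y0, var_y0)
```

Inverting X'X explicitly mirrors equation (3.2); in practice a solver such as `np.linalg.lstsq` is numerically safer, but the explicit inverse keeps the correspondence with the formulas visible.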


B. SELECTION OF THE RESPONSE VARIABLE

We have assumed that MOS subpopulations can be treated as discrete groups
with respect to their propensity to reenlist. Therefore, it follows that if the variables
relevant to the reenlistment decision were known, and their levels could be fixed, or
considered fixed for a period of time, the reenlistment propensity of these discrete
groups could also be considered fixed. Let us assume that these propensities are
probabilities. Then, since a soldier either does (1) or does not (0) reenlist, over a
period of time we will observe outcomes on repeated Bernoulli trials with fixed
parameter p.

If we further assume these observations are independent, then we can use the
maximum likelihood estimator for parameter p ($\hat{p}$ = number of reenlistments observed /
number of trials). Hence, one method for obtaining an estimate of the reenlistment
propensity for a given MOS is to observe outcomes on the reenlistment decision for a
period of time short enough so that relevant conditions may be fixed or considered
fixed, yet long enough to obtain a sample size which will enable us to discern small
changes in the population parameter.
The purpose of the SRB program, as stated in Chapter II, is to improve manning
in critical military specialties. An SRB can be considered effective in 2 ways. First, an
SRB can induce a soldier to reenlist for his own MOS, who may otherwise have left the
service. Second, it can induce a soldier to reenlist for his own MOS, who may
otherwise have reenlisted for training in another specialty. In conjunction with
program managers at MILPERCEN, the following retention (vice reclassification) rate
has been developed for use as the response variable in this study:

Y = retention rate = propensity of a soldier to reenlist for his own MOS.

It is estimated by:

$$\hat{Y} = \text{estimated retention rate} = \frac{\text{number of soldiers reenlisting for their own MOS}}{\text{number of soldiers eligible to do so}}$$

Obviously excluded from our estimator $\hat{Y}$ (not included in either numerator or
denominator expressions) are service members who are not fully eligible for
reenlistment at the decision point. An SRB cannot induce an otherwise ineligible
soldier to reenlist. Also excluded are reenlistments which occur outside the window of
eligibility (0 months for first term soldiers, 3 months otherwise) and all extensions.
These actions, while not independent of the effects of the SRB program, occur for
exceptional reasons unrelated to the SRB award level. Soldiers who reenlist, but
reclassify in conjunction with reenlistment, are not counted in the numerator of our
estimator, but are included in the denominator.
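
The counting rules for the estimator can be made concrete with a small sketch. The record layout below is hypothetical (the actual MILPERCEN tape format is not described here); it only encodes the rules stated above: excluded cases appear in neither count, reclassified reenlistments and separations appear in the denominator only.

```python
from dataclasses import dataclass


@dataclass
class Record:
    outcome: str            # "own_mos", "reclassified", or "separated" (hypothetical codes)
    excluded: bool = False  # ineligible soldier, out-of-window reenlistment, or extension


def retention_rate(records):
    """Estimated retention rate: own-MOS reenlistments over eligible decisions."""
    counted = [r for r in records if not r.excluded]
    if not counted:
        raise ValueError("no eligible decisions to count")
    own = sum(1 for r in counted if r.outcome == "own_mos")
    return own / len(counted)


# Made-up cell: 3 own-MOS reenlistments, 1 reclassification, 1 separation,
# and 1 excluded action that enters neither numerator nor denominator.
sample = [Record("own_mos")] * 3 + [
    Record("reclassified"),
    Record("separated"),
    Record("own_mos", excluded=True),
]
print(retention_rate(sample))  # 0.6
```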

Retention data are available at the individual soldier level on mass storage at
MILPERCEN. However, owing to significant changes in the manner in which these
data were recorded prior to fiscal year 1981, earlier data are not readily available. A
magnetic tape, containing information pertinent to the reenlistment or separation of
soldiers during the period 1 Oct 81 through 30 Sep 85, was provided by MILPERCEN
to support this study. Excluded from this tape were transactions concerning service
members outside of the three SRB zones, or who otherwise fell into an excluded
category  as  described  in  the  previous  paragraph.  In  all,  more  than  dS  1,000  individual 
records  were  included  in  the  file. 


C.  SELECTION  OF  THE  CARRIER  VARIABLES 


1.  SRB  Level 

SRB level is the carrier variable of interest in this study. It exists at one of 6
discrete levels for all MOS, for all zones, at all times. These levels are 0, 1, 2, 3, 4,
and 5. Record of the SRB history for each MOS is not currently available in machine
readable form, but hardcopy records were made available by the MILPERCEN
program managers dating back to 1974.

2.  Endogenous  Variables 

The endogenous variables, for the purposes of this study, are those variables
which provide information on the demographic composition of the discrete groups
themselves. For each record contained on the data tape provided by MILPERCEN,
the following demographic data are recorded:

1 AFQT score,

2 civilian education level,

3 sex,

4 number of dependents,

5 race.

It is our intention in recording these data, to construct variables which may be
included in the overall regression model to control for the effects of population
dynamics.

3. Exogenous Variables

Unemployment rate is included as a statistic which is visible to the
reenlistment decision-maker and may represent one quantitative measure of the
soldier's career alternatives. These data are readily available in the Employment and
Earnings Monthly, published by the Bureau of Labor Statistics (BLS). The data are
summarized by occupational classification and region. Since most Army skills do not
readily fall into any of the BLS classifications, our statistic of choice is the seasonalized
aggregate unemployment rate.

Consumer Price Index (CPI), as a measure of the change in the spending
power of the soldier, is also considered a vital statistic. Data are again available on a
monthly basis in the BLS published CPI Detailed Report. The statistic most relevant
for our uses is the seasonalized statistic for all urban consumers.

Pay scale changes are believed to be at least as important as CPI. Considered
with CPI, a measure of the real change in a soldier's purchasing power can be derived.
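
The text does not specify how pay scale changes and CPI are to be combined. One common way to express a real change in purchasing power, shown here purely as an assumption and not as the measure adopted in this study, is to deflate the nominal pay raise by the change in CPI over the same period. The function name and the figures are hypothetical.

```python
def real_pay_change(pay_raise, cpi_old, cpi_new):
    """Real change in purchasing power over a period (illustrative formula only).

    pay_raise : nominal pay scale change as a fraction (0.04 means a 4% raise)
    cpi_old, cpi_new : CPI at the start and end of the same period
    """
    return (1.0 + pay_raise) * cpi_old / cpi_new - 1.0


# A 4% raise while prices rise 10% leaves the soldier worse off in real terms.
change = real_pay_change(0.04, cpi_old=100.0, cpi_new=110.0)
print(round(change, 4))
```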
Promotion opportunity to pay grades E5 and E6 is considered very important.
Variables which account for the change in promotion opportunity at the MOS level
were of specific concern to the MILPERCEN program managers. Our problem here,
however, is to identify a measure visible to the reenlistment decision-maker and for
which a reliable historic record exists. The monthly published promotion cut-off scores
were an immediate choice as an explicit and simple indicator of relative promotion
opportunity, but MILPERCEN promotion program managers have maintained no
data older than 2 years. As an alternative, it was decided to include a statistic reported
on the monthly DCSPER 411, Enlisted Strength Report, available on microfiche only.
The statistic, mean time in service at promotion for those promoted in the previous 12
months, reports a 12 month promotion moving point average for both grades at the
MOS level. This statistic is included, as it is believed that a soldier making a
reenlistment decision is sensitive to the effects that changes in promotion policy have on the
careers of those around him.

D.  SELECTION  OF  A  SAMPLE  PERIOD 

As has been mentioned, our data collection capability is limited to the five fiscal
years from FY81 through FY85. A change in the manner in which loss data were
recorded precludes our obtaining reliable data on earlier records.

Inasmuch as we plan to observe outcomes on the reenlistment decision over a
period of time during which the levels of the independent variables included in our
regression model are considered fixed, we must decide upon a sample period. An
immediately attractive alternative is the fiscal quarter for several reasons. First, the
SRB program is managed in accordance with a quarterly cycle. Second, several of our
data (such as the promotion statistics) are reported at quarterly intervals. Third,
several of our data (such as CPI) are much more stable at the quarter level.

Analysis was conducted to determine the appropriate sample size of eligibles
required to ensure that a reliable base of MOS and zone specific retention rate
estimates was obtained. Specifically, we wish our sample size to be large enough so
that 90% of the time our estimate $\hat{Y}$ is within 10% of the true parameter Y. Then
using an approximate 90% confidence interval for the Bernoulli parameter Y
[Ref. 5: pp. 394-a'hs], we can compute the minimum number of observations, n,
required to satisfy our requirement. The approximate 90% confidence interval can be
written as:

$$P\left(\hat{Y} - 1.645\left(\frac{\hat{Y}(1-\hat{Y})}{n}\right)^{1/2} < Y < \hat{Y} + 1.645\left(\frac{\hat{Y}(1-\hat{Y})}{n}\right)^{1/2}\right) = .90$$

The variance of the estimate is maximized with $\hat{Y} = 0.5$:

$$P\left(.5 - 1.645\left(\frac{.25}{n}\right)^{1/2} < Y < .5 + 1.645\left(\frac{.25}{n}\right)^{1/2}\right) = .90$$

We see that to be 90% confident that our estimate $\hat{Y}$ is within 10% of the true
parameter Y (that is, within .10 on the rate scale), it must be true that:

$$1.645\left(\frac{.25}{n}\right)^{1/2} \le .10$$

Solving the above inequality for n gives $n \ge (1.645/.10)^2(.25) = 67.65$, so that:

$$n \ge 68$$
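
The sample size calculation can be checked numerically. The short Python sketch below uses the same worst-case variance (Y = 0.5) and the same normal quantile of 1.645; the function name and its parameters are illustrative only.

```python
import math


def min_sample_size(half_width, z=1.645, p=0.5):
    """Smallest n for which z * sqrt(p * (1 - p) / n) <= half_width."""
    return math.ceil((z / half_width) ** 2 * p * (1.0 - p))


print(min_sample_size(0.10))  # 68
print(min_sample_size(0.20))  # 17
```

Doubling the tolerated half-width to .20 cuts the required sample size by roughly a factor of four, to 17 observations.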


We next require each MOS included in our analysis to have at least 68 zone A
reenlistment outcomes per quarter for no fewer than 14 of the 20 quarters of data
available. We will refer to such MOS as high density. In addition we require that the
MOS be authorized as of the end of FY85 and that it have an active SRB history in
our period of study. That is, there must be at least one change in SRB level during the
data period. When these requirements are imposed, the number of MOS included in
our analysis is reduced from an initial 374 to 24. These MOS are listed in Table 1.

Consider  the  SRB  budget  history  summarized  at  Figure  2.2.  While  the  number 
of  MOS  included  in  our  analysis  represents  only  6.4%  of  the  total  MOS  in  the 
inventory,  during  the  5  year  period  of  our  study,  these  24  MOS  accounted  for  over 
34%  of  the  zone  A  reenlistments  and  over  00%  of  the  total  zone  A  bonus  budget 
outlays.  With  these  facts  in  mind,  we  will  pursue  our  development  of  a  zone  A 
retention model using only the 24 high density MOS. In doing so, we make the
following  observations: 

1  The  developed  model  should  be  accurate  for  the  24  high  density  MOS. 

2  Inasmuch as the model will account for over 34% of the total zone A
reenlistments in the Army, it is very likely to be reasonably accurate for the
moderate density MOS in the inventory. (A moderately dense MOS is one for
which at least 17, but less than 68, outcomes per quarter can be observed for no
fewer than 14 of the 20 quarters of data available for our study. The requirement
for 17 observations allows us 90% confidence that our estimator is within 20%
of the true retention rate, Y.) An application of the developed model to those
MOS will not be unjustified.

3  It may not be possible to adequately represent the retention behavior of all low
density MOS with an overall model. By their nature, they are managed
exceptionally. Their group perception of the factors which affect their
reenlistment decision will not likely be similar to that of any other MOS group.
Efforts to group these low density MOS, creating artificial high density sample
cells, as has been done in several studies by both CAA and Rand Corporation
(including those previously referenced), must be well documented and controlled.
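The density rules stated above can be sketched as a small classifier. This is a hypothetical illustration: the function name and the reading of "moderate" as "at least 17 outcomes in at least 14 quarters, but not high density" are assumptions, and the additional requirements (authorization as of the end of FY85 and an active SRB history) are not modelled.

```python
def classify_density(quarterly_counts, high=68, moderate=17, min_quarters=14):
    """Classify one MOS from its zone A outcome counts per quarter.

    quarterly_counts holds one count per quarter (20 quarters in the study).
    The thresholds 68 and 17 and the 14-quarter rule come from the text;
    everything else here is an illustrative assumption.
    """
    quarters_high = sum(1 for c in quarterly_counts if c >= high)
    quarters_moderate = sum(1 for c in quarterly_counts if c >= moderate)
    if quarters_high >= min_quarters:
        return "high"
    if quarters_moderate >= min_quarters:
        return "moderate"
    return "low"


# Hypothetical counts: 14 large quarters and 6 small ones.
print(classify_density([70] * 14 + [10] * 6))
```

An MOS with only 13 quarters at or above 68 falls back to the moderate test, which is how the sketch separates the second group from the first.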


TABLE 1


MOS INCLUDED IN THIS ANALYSIS
(HIGH DENSITY)

MOS    TITLE

11B    Infantryman
11C    Indirect Fire Infantryman
11H    Heavy Anti-armor Weapon Infantryman
12B    Combat Engineer
12C    Bridge Crewman
13B    Cannon Crewmember
13E    Cannon Fire Direction Control Specialist
13F    Fire Support Specialist
16R    ADA Short Range Gunnery Crew Member
16S    MANPADS Crewmember
19D    Cavalry Scout
19E    M48-M60 Armor Crewmember
31M    Multichannel Commo Equip Operator
31V    Tactical Commo Equip Operator
51B    Carpentry Masonry Specialist
54E    NBC Specialist
63B    Light Wheel Vehicle Mechanic
63H    Track Vehicle Repairer
63N    M60A1/A3 Tank System Mechanic
63T    Bradley FVS Mechanic
63W    Wheel Vehicle Repairer
72G    Telecommunications Center Operator
76W    Petroleum Supply Specialist
82C    Field Artillery Surveyor


It is acknowledged here that our approach to the sample size problem is very
conservative. We will show in Chapter IV that actual results from applying our
proposed linear model to available data for high density MOS can yield 90%
confidence intervals which are considerably shorter than (+/-)10%.

E.  DATA  PREPARATION 

The  zone  A  SRB  level  in  effect  for  each  MOS  and  for  each  quarter  is  included  in 
the  candidate  carrier  variable  data  set  (as  variable  SRB)  without  modification.  An 
additional variable, SRBSQ ($\mathrm{SRB}^2$), is also included to account for the possible
nonlinear effects of the SRB program on retention.

The  FORTRAN  code  which  was  used  to  develop  retention  rates  (response 
variable REUP) and other rates associated with the endogenous variable set is included
at  Appendix  A.  The  retention  rate  algorithm  is  straightforward  and  consistent  with  the 
rules  set  forth  in  section  B  of  this  Chapter.  The  endogenous  carrier  variables  are 
defined  for  each  of  the  24  MOS  and  for  each  of  the  20  quarters  as  follows: 

1  AFQT : eligible population scoring less than 50 on the AFQT / total eligible.

2  CIVED : eligible population completing at least 12 years of formal education /
total eligible.

3  SEX : eligible females / total eligible.

4  DEP : eligible population with more than 2 dependents / total eligible.

5  RACE : eligible non-caucasians / total eligible.

Initial demographic rate definitions were suggested by retention program managers at
MILPERCEN. The final definitions reported above were developed through a trial
and error process. These definitions were found to provide the most meaningful
description of an eligible population.

A variable named REAL was constructed as a linear combination of the CPI and
the annual pay raise received by the service member. Specifically,
REAL = % pay raise - CPI. The variable was considered as a carrier because we
found that it adequately accounted for the changes in the soldier's purchasing power,
while consuming one fewer model degree of freedom than entering its two
components separately.

The E5 and E6 promotion opportunity variables included in the candidate carrier
variable set were constructed as follows:

1  E5TEST2 : mean time in service (TIS) at promotion to grade E5 for those
promoted in the previous 12 months (MOS level) / mean TIS at promotion to
grade E5 for those promoted in the previous 12 months (Army level).

2  E6TEST2 : mean TIS at promotion to grade E6 for those promoted in the
previous 12 months (MOS level) / mean TIS at promotion to grade E6 for
those promoted in the previous 12 months (Army level).


We  expect  to  find  that  E5  and  E6  promotion  opportunity  (here,  measured  relative  to 
an  Army  average)  are  effective  retention  incentives.  That  is,  as  the  relative  opportunity 
for  promotion  in  a  particular  MOS  is  enhanced,  so  should  the  retention  rate  be 
enhanced,  given  the  levels  of  all  other  factors  are  unchanged. 

The seasonally adjusted unemployment rate (UNEMPLY) is included in the
candidate  variable  set  without  modification. 

Our earliest analysis of the data provided by MILPERCEN indicates the
existence of a strong seasonal trend in retention. Figure 3.1 graphically depicts this
trend. The solid line represents the aggregate estimated retention rate for all MOS
which  were  not  included  in  the  SRB  program  during  our  period  of  analysis.  The 
broken  line  represents  the  aggregate  estimated  retention  rate  for  all  MOS  which  were 
included  in  the  SRB  program  during  our  period  of  analysis. 

Three  observations  can  immediately  be  made.  First,  the  aggregate  trends  are 
very  similar.  Second,  despite  the  inducement  of  a  bonus,  MOS  included  in  the  SRB 
program  tend  to  have  lower  rates  of  retention  than  those  not  included.  Third,  and 
most  importantly,  it  is  evident  that  we  could  capture  a  good  deal  of  the  seasonality  by 
including  the  variables  QTR  (representing  the  actual  fiscal  quarter  associated  with  each 
data point and taking on values 1, 2, 3 or 4) and QTRSQ ($\mathrm{QTR}^2$) in the candidate
variable  data  set.  A  variable  or  set  of  variables  which  accurately  accounts  for  an  effect 
such  as  seasonality  is  preferred  to  an  explicit  representation  of  the  cause  when,  such  as 
in  our  case,  the  result  is  a  large  reduction  in  model  degrees  of  freedom. 
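As a minimal sketch of the data preparation described in this section, the derived carriers can be built from the raw inputs as follows. The function names and argument names are hypothetical; only the definitions SRBSQ = SRB squared, QTRSQ = QTR squared and REAL = % pay raise - CPI are taken from the text.

```python
def build_carriers(srb, qtr, pay_raise_pct, cpi_pct):
    """Derived candidate carriers for one MOS-quarter observation."""
    return {
        "SRB": srb,
        "SRBSQ": srb ** 2,                # possible nonlinear SRB effect
        "QTR": qtr,                       # fiscal quarter, 1 through 4
        "QTRSQ": qtr ** 2,                # lets the seasonal effect curve
        "REAL": pay_raise_pct - cpi_pct,  # change in purchasing power
    }


def seasonal_peak(b_qtr, b_qtrsq):
    """Quarter value at which b_qtr * QTR + b_qtrsq * QTR**2 peaks.

    Only meaningful when b_qtrsq is negative (a concave seasonal curve).
    """
    return -b_qtr / (2.0 * b_qtrsq)


# Hypothetical inputs, chosen only to exercise the definitions.
print(build_carriers(srb=2, qtr=3, pay_raise_pct=4.0, cpi_pct=3.5))
print(seasonal_peak(0.10, -0.02))
```

The two quarter terms together describe a single curved seasonal profile with two coefficients, which is the saving in degrees of freedom that motivates the QTR and QTRSQ pair.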

F.  THE  STEPWISE  REGRESSION  MODEL 

Stepwise regression is a method of building a multiple linear regression model
using only the best independent carrier variables. In stepwise regression, we first
construct a first order linear regression model using only that independent variable
which is most highly correlated with the designated response variable. We check the
results of an overall F-test to determine if our regression is significant at some
pre-selected level. If not, we discontinue our analysis and select $\hat{Y} = \bar{Y}$ as our best
predictor. Otherwise, we retain that initial variable in our model and search for a
second significant carrier variable to enter the regression. The partial correlations of
each of the remaining candidate carrier variables with the response variable are
examined and the variable with the highest partial correlation is added to the
regression. The partial F-statistics of each carrier variable included in the model are
examined and compared to a pre-selected acceptance level. If they are both significant,
they are retained and a third candidate carrier is proposed. Otherwise, we eliminate the
non-significant carrier(s) from the regression model and identify the next best
candidate. This process is continued until the set of variables included in the model
cannot be altered at the pre-selected significance level. [Ref. 4: pp. 306, 312].

Figure 3.1  The Seasonality of Retention
(Bonus and Non-bonus MOS)
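The stepwise procedure described in this section can be sketched in pure Python as follows. This is not the SAS STEPWISE procedure used in the thesis: as a simplifying assumption, fixed F-to-enter and F-to-remove thresholds (`f_enter`, `f_remove`) stand in for the .15 significance level, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def sse(data, names, y):
    """Error sum of squares for OLS of y on an intercept plus the named carriers."""
    rows = [[1.0] + [data[k][i] for k in names] for i in range(len(y))]
    p = len(names) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    b = solve(xtx, xty)
    return sum((yi - sum(bj * rj for bj, rj in zip(b, r))) ** 2
               for r, yi in zip(rows, y))


def partial_f(sse_reduced, sse_full, n, p_full):
    """Partial F statistic for the one carrier separating the two models."""
    mse = max(sse_full / (n - p_full), 1e-12)  # guard against a perfect fit
    return (sse_reduced - sse_full) / mse


def stepwise(data, y, f_enter=4.0, f_remove=3.9):
    """Return the carrier names kept by a forward/backward stepwise search."""
    n, selected = len(y), []
    for _ in range(2 * len(data) + 1):  # guard against cycling
        remaining = [k for k in data if k not in selected]
        if not remaining:
            break
        current = sse(data, selected, y)
        scores = {k: partial_f(current, sse(data, selected + [k], y),
                               n, len(selected) + 2) for k in remaining}
        best = max(scores, key=scores.get)
        if scores[best] < f_enter:
            break
        selected.append(best)
        while len(selected) > 1:  # drop carriers that are no longer significant
            full = sse(data, selected, y)
            fs = {k: partial_f(sse(data, [j for j in selected if j != k], y),
                               full, n, len(selected) + 1) for k in selected}
            worst = min(fs, key=fs.get)
            if fs[worst] >= f_remove:
                break
            selected.remove(worst)
    return selected
```

Choosing the candidate with the largest partial F at each step is equivalent to choosing the one with the highest partial correlation with the response. Keeping `f_remove` below `f_enter` prevents a carrier from being removed immediately after it enters.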

The correlation matrix of the response variable and each of the candidate carrier
variables for all data in our data set (24 MOS x 20 observations per MOS = 480
observations) is at Appendix B. Note that the variable SRB is more highly correlated
with the response variable REUP than any other. There do not appear to be any
dangerous correlations among the candidate carriers at the aggregate level. Recall, we
wish to guard against any singularity or near singularity of the X'X matrix.


An example of an input data set for MOS 63B is at Appendix C. Note that
variables SRBSQ and QTRSQ do not appear, as they are constructed in the modelling
process. An example of the output from a SAS STEPWISE procedure is at Appendix
D. Precise instructions for interpreting this output are contained in [Ref. 6] and
[Ref. 7: pp. 761-774]. The SAS commands which were used to generate this output are
included in Appendix G.

G.  RESULTS  OF  THE  STEPWISE  ANALYSIS 

We  summarize  the  results  of  our  stepwise  analysis  in  three  ways.  First,  we 
examine  the  results  of  each  regression  to  determine  which  carrier  variables  had 
estimated  regression  coefficients  which  were  reasonably  and  consistently  signed  and 
significant  at  the  .15  acceptance  level  most  often.  Then,  as  a  measure  of  the  total 
variation in retention rate explained by our model, we examine the $R^2$ statistic for all
MOS included in our analysis. Finally, as a measure of goodness of fit, we examine
the Mallows $C_p$ statistic for all MOS included in our analysis. After we have proposed
and applied an overall model, a more detailed analysis of model residuals is presented
in Chapter IV.

1.  Significant  Carriers 

In Table 2, each candidate carrier variable is listed. The pair SRB* / SRBSQ*
and the pair QTR* / QTRSQ* are also included and will be used to record the event
that both carriers were considered significant for a particular MOS. For example, if
SRB and SRBSQ are both included for some MOS, an observation will not be recorded
for the carriers SRB and SRBSQ. Instead an observation will be recorded for both
SRB* and SRBSQ*. Observations for SRB and SRBSQ (or QTR and QTRSQ) are
recorded only when they are un-paired. An observation for any candidate variable is
recorded when the variable has been included in the stepwise model at the .15
level of significance. The manner of record chosen (+ / -) indicates the sign of the
estimated coefficient.

We note in Table 2 that the SRB* / SRBSQ* pair is not often significant while
the QTR* / QTRSQ* pair is. However, we also note that the variables SRB or
SRBSQ, or their pair, are considered significant in 17 of the 24 individual models
examined. Other variables which appear to be excellent carrier candidates are RACE,
DEP and REAL.
TABLE 2


SIGNIFICANCE OF CARRIERS (STEPWISE PROCEDURE)
(0.15 SIGNIFICANCE LEVEL)

CARRIER    RESULTS

SRB        +++++++++++
SRBSQ      +++++
SRB*       +
SRBSQ*
QTR
QTRSQ      ---
QTR*       ++++-++++
QTRSQ*
RACE       ++++++++
DEP        +--+++++++++
EDUCATE    -
AFQT       +++--
E5TEST2    ++++--
E6TEST2    +++-
UNEMPLY    ++++
REAL       +++-+++++++


2.  The $R^2$ Statistic

A commonly accepted statistic for measuring the value of a regression
equation is the $R^2$ statistic. The $R^2$ statistic actually measures the proportion of total
variation about the mean, $\bar{Y}$, which is accounted for by the regression. We are
cautious in using this statistic, because it can be made arbitrarily high by adding
different, albeit meaningless, carriers [Ref. 4: p. 33].

With this caution in mind, the results of our $R^2$ analysis are summarized in
Figure 3.2. The horizontal axis is grouped into $R^2$ bins of width 0.1, while the vertical
axis represents the number of occurrences.
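A minimal sketch of the $R^2$ computation and of the binning used for the bar charts is given below. The function names are hypothetical, and labelling each bin by its midpoint (.45, .65, .95 and so on) is an assumption inferred from the way the bins are named in the text.

```python
def r_squared(y, fitted):
    """Proportion of the variation of y about its mean explained by the fit."""
    mean_y = sum(y) / len(y)
    sst = sum((v - mean_y) ** 2 for v in y)
    sse = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - sse / sst


def r2_bin(r2, width_pct=10):
    """Midpoint label of the width-0.1 bin containing r2, e.g. 0.48 -> 0.45.

    Integer percent arithmetic avoids floating point trouble at bin edges.
    """
    index = min(int(round(r2 * 100)) // width_pct, 100 // width_pct - 1)
    return (index * width_pct + width_pct / 2) / 100


print(r2_bin(0.48), r2_bin(0.60))
```

A fit that returns the mean of y for every observation gives a value of 0, and a fit that reproduces y exactly gives 1.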




3.  The Mallows $C_p$ Statistic

Another popular statistic for measuring the goodness of fit for a proposed
model is the $C_p$ statistic developed by C. L. Mallows [Ref. 4: pp. 299 ff.]. The
expected value of the statistic is approximately the number of independent carriers
included in the regression model plus the intercept term (p). Extraordinarily high
values of the $C_p$ statistic indicate that our model suffers considerably from lack of fit;
that is, our residuals are composed of both random and systematic components. In
our analysis of the given data, we find that three of the proposed regression models
obtained via the stepwise procedure suffer from lack of fit. They are the models
associated with the MOS listed in Table 3. We will pay particular attention to these
MOS in attempting to fit an overall model.
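For reference, the $C_p$ statistic is commonly computed as the subset model's error sum of squares divided by the error mean square of the model containing all candidate carriers, less (n - 2p). The sketch below uses that standard form; the function name and the numeric inputs are hypothetical and are not values from the thesis.

```python
def mallows_cp(sse_p, p, n, s2_full):
    """Mallows C_p for a subset model with p parameters (intercept included).

    sse_p   : error sum of squares of the subset model
    n       : number of observations
    s2_full : error mean square of the model with all candidate carriers
    """
    return sse_p / s2_full - (n - 2 * p)


# Hypothetical values: n = 20 quarters, p = 4, and no lack of fit, so that
# sse_p is about (n - p) * s2_full and C_p comes out close to p.
print(mallows_cp(sse_p=16 * 0.01, p=4, n=20, s2_full=0.01))
```

When the subset model omits a carrier that matters, `sse_p` is inflated by the systematic component and $C_p$ rises well above p, which is the lack of fit signal described above.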

H.  THE  PROPOSED  OVERALL  MODEL 

The  proposed  overall  model,  based  on  the  requirements  of  the  study  and  the 
previous  analysis,  is  as  follows: 

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_2^2 + \beta_4 X_3 + \beta_5 X_4 + \beta_6 X_5 + \varepsilon \qquad (3.1)$$

where:

Y = retention rate (as previously defined)

$X_1$ = SRB
$X_2$ = QTR
$X_3$ = RACE
$X_4$ = DEP
$X_5$ = REAL

$\varepsilon$ = error component with assumed distribution $N(0, \sigma^2)$


and $\beta$ is a vector of the parameters to be estimated.
Figure 3.2  Distribution of $R^2$ Values
(Stepwise Procedure)
TABLE 3

LACK OF FIT MODELS
(FROM THE STEPWISE PROCEDURE)

MOS    $C_p$ Statistic    p


IV.  THE  ZONE  A  RETENTION  MODEL 


In  this  Chapter,  ordinary  least  squares  multiple  linear  regression  analysis  is  used 
to  fit  the  overall  model  proposed  in  Chapter  III  to  the  data  available  for  the  high 
density MOS. The results of this analysis are discussed in terms of carrier significance
and the $R^2$ statistic. An examination of the residuals is performed to investigate
suspected model inadequacies. The model is then fit to data available for the moderate
density MOS. The results of this analysis are briefly summarized and potential data
transformations  are  discussed.  A  demonstration  of  the  uses  of  this  model  in  both  a 
predictive  and  comparative  mode  is  presented.  Finally,  alternatives  for  modelling  low 
density  MOS  are  suggested. 

A.  THE OVERALL MODEL FITTED TO HIGH DENSITY MOS

The  overall  model,  as  proposed  in  the  previous  Chapter,  is  as  follows: 

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_2^2 + \beta_4 X_3 + \beta_5 X_4 + \beta_6 X_5 + \varepsilon \qquad (4.1)$$

where:

Y = retention rate (as previously defined)

$X_1$ = SRB
$X_2$ = QTR
$X_3$ = RACE
$X_4$ = DEP
$X_5$ = REAL

$\varepsilon$ = error component with assumed distribution $N(0, \sigma^2)$

and $\beta$ is a vector of the parameters to be estimated.

In applying ordinary least squares linear regression analysis to our data, we
recognize that we have 20 unadjusted degrees of freedom (df) available for each MOS
(via our 20 quarterly observations on the response and carrier variables). Our
proposed model requires 1 df for the intercept estimate, $b_0$, and 6 df for the proposed
carrier variables, leaving 13 df for error. While no hard and fast rules exist for the
optimal distribution of available df in the development of a linear model, a good rule is
to keep the model degrees of freedom (in our case, 7) small relative to the total
available degrees of freedom. This is a particularly good rule when the available degrees
of freedom are limited, as they are in our analysis.

The  proposed  overall  model  was  fitted  to  the  data  available  for  the  24  high 
density  MOS.  The  SAS  commands  which  were  used  to  generate  our  output  are 
included  at  Appendix  G.  A  copy  of  our  output  for  example  MOS  63B  is  at  Appendix 
E. 

We can easily summarize our results of this analysis in a manner similar to that
used for our stepwise analysis in the previous Chapter. First, we examine the estimated
coefficients of each carrier for each MOS to determine which were most often
consistent and most often significant. We note that our results for the included carriers
may well differ from the results we obtained for those same carriers in our stepwise
procedure. Despite our efforts to select candidate carriers which were unrelated, it is
very possible that for a particular MOS, a carrier which was included (AFQT, for
example) in the stepwise model served as a proxy [Ref. 3: p. 317] for some carrier which
was not included (say, DEP). Since DEP is included in the overall model, and AFQT
is not, it would not be surprising if DEP were to suddenly become significant at the .15
level in our current analysis, even though it was rejected at that same level in our
stepwise analysis. This phenomenon is a consequence of our resolve to develop an
overall model.

After our estimated coefficient analysis, we will present an $R^2$ statistic summary,
similar to that presented in Chapter III.

1.  Significant Carriers

In Table 4, a summary of the results in terms of significant carriers using
ordinary least squares multiple regression analysis is presented. The same definitions
for QTR* and QTRSQ* apply as in Chapter III; that is, they represent paired
observations on the variables QTR and QTRSQ. We notice that our results from this
analysis are very similar to those summarized at Table 2 for the stepwise analysis for
all variables except REAL. Previously, REAL was significant at the .15 acceptance
level a total of 11 times. In our current analysis, it is significant 18 times, or as many
times as the variable SRB is significant.

TABLE 4


SIGNIFICANCE OF CARRIERS (REGRESSION PROCEDURE)
(0.15 SIGNIFICANCE LEVEL)

CARRIER    RESULTS

SRB        ++++++++++++++++++
QTR
QTRSQ      -----
QTR*       ++++++++++++
QTRSQ*     ------------
RACE       +++++++++
DEP        +++++++
REAL       ++++++++++++++++++


At the individual MOS level, we can compare our output for MOS 63B via the
stepwise procedure (Appendix D) to the output generated when the overall model was
fitted (Appendix E). We note that the carriers which were considered significant via
the stepwise analysis, and which were also included in the overall model, remain
significant. Carrier variables DEP and REAL, which were not considered significant
via the stepwise procedure, are also not significant at the .15 level in our current
analysis, although their estimated regression coefficients are signed as expected. The
general effect of using an overall model, vice an MOS specific model, in this case is not
great. The $R^2$ statistic has been reduced from .93 to .87, and the overall significance
level of the regression has been slightly increased, owing to a slightly larger error mean
square value.

Note that a critical point made earlier in this thesis is supported by our
current analysis. The estimate of an individual regression coefficient is dependent, in
varying degrees, on its costock. The estimate $b_1$, with costock including E5TEST2
and UNEMPLY (via the stepwise procedure), is valued at .255. With E5TEST2 and
UNEMPLY removed, and with DEP and REAL included, the estimate $b_1$ is increased
to .304. While this difference may seem slight (and is, with respect to the standard
error of the estimate), it could be a very significant difference if this coefficient is used
as a point estimate of the effectiveness factor (as discussed earlier).


Figure 4.1  Distribution of $R^2$ Values
(Regression Procedure - High Density MOS)


2.  The $R^2$ Statistic


In Figure 4.1, we note that our lowest observed $R^2$ value is in the .65 bin,
whereas in our stepwise summary at Figure 3.2, it was in the .45 bin (an improvement
in the distribution of the $R^2$ values). We note also that the number of observations in
the .95 bin has been reduced from 4 in Figure 3.2 to 1 in Figure 4.1. We have
examined a case in the previous section wherein the $R^2$ value moved from the .95 bin
to the .85 bin (MOS 63B). MOS 16R is an example of an MOS which moved from the
.45 bin to the .65 bin in our analysis.

The actual difference in $R^2$ values for MOS 16R is .60 - .48 = .12. In the
stepwise procedure, only DEP and QTRSQ were included as significant carriers (at the
.15 level of acceptance). When the overall model was fitted to the data available for
MOS 16R, the other 4 carrier variables were not significant at the .15 acceptance level,
but all were signed as we expect, and some variables, such as SRB, were significant at
only slightly higher levels. In sum, while the $R^2$ statistic was increased for this MOS,
and the sum of squares due to regression was increased, the overall significance of the
regression was slightly reduced by the inclusion of the non-significant carrier terms.

B.  EXAMINATION OF RESIDUALS

Our residual analysis associated with fitting the proposed overall model to the
data available for the 24 high density MOS is summarized in the 4 graphics below.
The residuals of the 24 MOS were examined independently during the analysis phase of
this study, but are here presented in an aggregate manner with enhanced effect.

In conducting a residual analysis, we are examining the validity of the model
assumptions concerning the observed errors; that is, that they are independent, have a
0 mean, have a constant variance, and follow a normal distribution. At the conclusion
of our analysis, we should observe that either our model assumptions appear to be
violated or they do not appear so. [Ref. 4: pp. 141-142].

1.  The Frequency Plot

In Figure 4.2, we present a horizontal bar chart of the residuals, from
-.3 to +.3 in bins of width .01. The distribution of these residuals should appear
symmetric (specifically, bell shaped), and centered on 0. No contradiction to our
normality assumption is evident here.


2.  The Plot against Fitted Values

In Figure 4.3, we present a plot of the residuals versus the fitted values
associated with them. We hope to find no regular pattern in the residuals; that is, if
our model assumptions are correct, the distribution of the residuals is independent of
the fitted values. No contradiction to this assumption is evident.

3.  The Plot against Time Sequence

As in the plot against the fitted values, we should observe no patterns of
significance in the plot of residuals versus sequence of observation. In Figure 4.4, while
we note a tendency for positive valued residuals associated with observations 6 and 9,
they are not abnormally low or high and no regular patterns are discernible.

4.  The  Serial  Correlation  Plots 

In Figures 4.5 and 4.6, we test for Lag-1 and Lag-4 serial correlation,
respectively. If our observed errors are pairwise uncorrelated, then a cloud centered on
coordinate (0, 0) should be the only discernible pattern. The Lag-4 plot is suggested
by our suspicion that some seasonality effects remain, even after the addition of the
QTR and QTRSQ variables to our overall model. It is seen that our suspicions are
unfounded.
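The serial correlation plots have a simple numerical counterpart: the correlation between each residual and the residual observed a fixed number of quarters earlier. The sketch below is an illustration with a hypothetical function name and made-up residuals; for pooled data the lagged pairs would have to be formed within each MOS, which is not shown.

```python
import math


def lag_correlation(residuals, lag):
    """Pearson correlation between e[t] and e[t - lag], for lag >= 1."""
    x = residuals[lag:]
    y = residuals[:-lag]
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up residuals with a pure four quarter cycle over 20 quarters.
seasonal = [1.0, 0.0, -1.0, 0.0] * 5
print(lag_correlation(seasonal, 4))
```

Residuals that still carried a four quarter cycle would show a Lag-4 correlation near 1, whereas the cloud centered on (0, 0) described above corresponds to a value near 0.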

With these results in hand, we are prepared to accept our modelling
assumptions as reasonable. The SAS commands which were used to produce all the
previous residual graphics are included at Appendix G.

C.  THE OVERALL MODEL FITTED TO MODERATE DENSITY MOS

We now have an opportunity to verify our proposed overall model with a fresh
data set. From among the remaining MOS, we selected 50 moderate density MOS for
which we have record of an active SRB history during the fiscal years covered by our
study. Data for these MOS were gathered in the same manner as for the 24 high
density MOS. The proposed linear model was fitted to these data and the results from
the independent fittings are summarized, in the aggregate, as follows:

1.  Significant Carriers

The primary carrier variable of interest, SRB, continues to serve as an
excellent predictor variable. In our current analysis, it is significant at the .15
acceptance level in at least 20 of the 50 moderate density models. The pair QTR and
QTRSQ were also included as significant in at least 20 of 50 cases. The carriers
RACE, REAL and DEP were not considered to be significant as often (14, 14 and 11
times, respectively), but their estimated coefficients were consistently signed (always
positive) and were frequently significant at levels just above the .15 acceptance
threshold.

Figure 4.7  Distribution of $R^2$ Values
(Regression Procedure - Moderate Density MOS)

2.  The $R^2$ Statistic

At Figure 4.7, the distribution of $R^2$ values, obtained from fitting the
proposed overall model to the data available for the 50 moderate density MOS, is
plotted, as previously, with a bar chart. Two points are worth noting with respect to
Figure 4.7. First, as measured in terms of the $R^2$ statistic, our proposed overall model
continues to serve us well in explaining the variation in retention rate through time at
the MOS level. Second, the distribution of observations on the $R^2$ statistic for
moderate density MOS seems to be more widely spread than the $R^2$ distribution for
high density MOS. This phenomenon is not unexpected when the smaller sample sizes
associated with the moderate density MOS are considered. If our proposed overall
model is correct, the decreased level of precision with which we can measure outcomes
on the response variable, $\hat{Y}$, will cause a general increase in the variability of the $R^2$
statistic, and a general decrease in its mean value.

Our error term $\varepsilon$ in the overall model actually accounts for the simultaneous
effect of errors from several sources. The first, and most obvious source, is our
inability to know or measure all factors which are critical to the reenlistment decision
for all soldiers. A second significant source is our inability to measure the true
response variable, Y. Recall, we estimate the zone A retention rate of a particular
MOS for a particular quarter with:

$\hat{Y}$ = number of soldiers reenlisting for their own MOS / number of soldiers eligible
to do so.

We have shown that the variance of the estimate generally increases with decreasing
sample size. However, for a particular MOS, if the general size of the sample can be
considered stable in our period of study, then this measurement error is simply
absorbed in the error term $\varepsilon$, without effect on the modelling assumptions. To the
extent that the $R^2$ statistic can be thought of as the ratio of the variation in the data
around $\bar{Y}$ explained by the regression, to total variation in the data around $\bar{Y}$ (which
includes the variation accounted for by the error term), the decrease in the mean $R^2$
outcome, and increase in variability, are expected for the lower density MOS. [Ref. 8:
pp. 93-94].

3.  Residual  Analysis 

An extensive analysis of aggregate residual plots is not presented here because
the results are very similar to the results we obtained when the overall model was fitted
to the data for high density MOS. One plot which is worthy of note, however, is the
plot of residuals vs. sequence of observation at Figure 4.8. In our earlier analysis of
residuals for high density MOS, we noted that residuals for quarters 6 and 9 appeared
to be skewed positive. We note that for residuals associated with fitting the overall
model to data for the 50 moderate density MOS, this perceived skewing is not
apparent. This observation lessens our concern that our error term contains systematic
and biasing components.
is considered because proportional type data typically do not have a uniform variance;
the estimated variance of the data is dependent on the rate itself. However, these
transformations are not used when the values of the estimated rates are in the range
(0.3 - 0.7). In this range, the most common variance stabilizing transformations are
nearly  linear,  and  the  dependence  of  the  sample  variance  on  the  estimated  rate  is 
minimal.  In  the  graph  at  Figure  4.3,  we  see  no  evidence  which  warrants  a  variance 
stabilizing  transformation  of  our  data.  The  overall  model  without  transformation  is 
believed  best  suited  to  the  needs  of  our  intended  user. 
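The near-linearity claim can be illustrated with one common variance stabilizing transformation for proportions, the arcsine square root. That this is the transformation the text has in mind is an assumption; the sketch below simply compares the local slope of the transformation inside and outside the 0.3 to 0.7 range.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transformation of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def slope(p):
    """Derivative of arcsine_sqrt at p, valid for 0 < p < 1."""
    return 1.0 / (2.0 * math.sqrt(p * (1.0 - p)))


# A transformation is close to linear where its slope is close to constant.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(arcsine_sqrt(p), 4), round(slope(p), 3))
```

Between 0.3 and 0.7 the slope stays within about 10% of its value at 0.5, while at 0.1 or 0.9 it is roughly two thirds larger, which is why the transformation changes little for rates in the middle range.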

F.  A  DEMONSTRATION  OF  MODEL  USE 

We have shown in Chapter III that, given our model is correct, a prediction of Y
at $X_0$ is given by:

$$\hat{Y}_0 = X_0' b \qquad (4.2)$$

with variance given by:

$$V(\hat{Y}_0) = X_0' (X'X)^{-1} X_0 \, \sigma^2 \qquad (4.3)$$

Using the error mean square term as our best estimate of $\sigma^2$, we can construct a 90%
confidence interval for the true mean value of Y at $X_0$ as follows:

$$\hat{Y}_0 \pm t_{(\nu,\,1-\alpha/2)} \; s \, \left( X_0' (X'X)^{-1} X_0 \right)^{1/2} \qquad (4.4)$$

where s represents the square root of error mean square, $\nu$ is the error degrees of
freedom, and $\alpha = 0.10$.
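As a rough illustration of the computation in equations 4.2 through 4.4, the sketch below evaluates the point prediction, its standard error, and the interval for the mean response. It is not the SAS PROC MATRIX code used in the thesis; the function names and the two-parameter example values are hypothetical, and the t quantile is passed in by the caller because the Python standard library does not provide one.

```python
import math


def quad_form(x, a):
    """Return x' A x for a vector x and a square matrix A (lists of floats)."""
    n = len(x)
    return sum(x[i] * a[i][j] * x[j] for i in range(n) for j in range(n))


def mean_response_interval(x0, b, xtx_inv, s, t_quantile):
    """Return (lower, y0_hat, upper, se) for the mean response at x0.

    x0 includes the leading 1 for the intercept, b holds the estimated
    coefficients, xtx_inv is the inverse of X'X, and s is the square root
    of the error mean square.
    """
    y0_hat = sum(xi * bi for xi, bi in zip(x0, b))
    se = s * math.sqrt(quad_form(x0, xtx_inv))
    half_width = t_quantile * se
    return y0_hat - half_width, y0_hat, y0_hat + half_width, se


# Hypothetical two-parameter example (intercept plus one carrier variable).
lower, y0_hat, upper, se = mean_response_interval(
    x0=[1.0, 2.0],
    b=[0.40, 0.05],
    xtx_inv=[[0.5, -0.2], [-0.2, 0.1]],
    s=0.1,
    t_quantile=1.771,  # 0.95 quantile of t with 13 degrees of freedom
)
print(round(lower, 4), round(y0_hat, 4), round(upper, 4), round(se, 4))
```

For the seven-parameter overall model the same function applies unchanged, with a 7-element `x0` and a 7 by 7 `xtx_inv`.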

To demonstrate the use of this model, we have arbitrarily selected MOS 1111 for
the purpose of conducting sensitivity analysis. The value of the R² statistic when the
overall model was fitted is .7283, and the variables QTRSQ, SRB, and REAL are
significant at the .15 level of acceptance. Analysis of the residuals reveals no
significant departure from normality.

To perform our analysis, we again resort to the SAS statistical software package.
The $(X'X)^{-1}$ matrix, the estimated regression coefficients and the error mean square,
calculated using PROC REG, were printed to an output file. The computational
formulas shown in equations 4.2, 4.3, and 4.4 were added to this file, and it was
prepared as an input file to the SAS PROC MATRIX routine. Copies of the input
files and output files involved in this procedure are at Appendix F.

The 7 dimensional vector of values on the independent variables at which we
wish to predict an outcome for the dependent variable Y is represented by $X_0$. Let us
hypothesize an $X_0$ value of (1, 2, 4, 0, 0.4, 0.25, 3.0), where the first position of the
vector is reserved for the unity multiplier of the intercept term and the remaining
values represent outcomes on the independent variables QTR, QTRSQ, SRB, RACE,
DEP, and REAL respectively. The 90% confidence interval on the true mean value of
Y at $X_0$ is shown on the first line at Table 4. The 90% confidence intervals on the
true mean value of Y at $X_0$ when the hypothesized value of SRB is changed to levels 1,
2, and 3 are shown on lines 2, 3, and 4 of Table 4 respectively.


TABLE 5

SENSITIVITY ANALYSIS FOR MOS 1111

SRB Level    .90 LB     Ŷ0      .90 UB    s.e.(Predict)
    0         .301     .426      .551        .0707
    1         .376     .475      .574        .0561
    2         .433     .524      .615        .0511
    3         .469     .573      .677        .0585

In the results summarized at Table 4, we observe two phenomena. First, and as
expected, the value of $\hat{Y}_0$ increases at a steady rate of .049 with each unit increase in
SRB level (.049 is the estimated coefficient of the SRB variable). Second, and more
importantly, note the behavior of the standard error of the prediction (s.e. (Predict),
the square root of our $V(\hat{Y}_0)$ term). It decreases through SRB level 2 and increases
thereafter. The decrease is the result of our moving closer to the center of the sample
data space. As we move further from the center of the sample data space, reliance on
a point estimate for the response variable is increasingly dangerous. If we attempt to
extrapolate beyond our sample data space, we can have very little confidence in the
validity of our point prediction [Ref. 4: p. X].
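The tabulated bounds for MOS 1111 can be reproduced from the point predictions and standard errors alone. The sketch below assumes 13 error degrees of freedom (20 quarterly observations less 7 estimated parameters), for which the 0.95 quantile of the t distribution is about 1.771; that degrees-of-freedom figure is an inference, not a value stated in the text.

```python
T_QUANTILE = 1.771  # assumed: 0.95 quantile of t with 13 degrees of freedom

# (SRB level, .90 LB, predicted Y, .90 UB, s.e.(Predict)) from the table.
TABLE_ROWS = [
    (0, 0.301, 0.426, 0.551, 0.0707),
    (1, 0.376, 0.475, 0.574, 0.0561),
    (2, 0.433, 0.524, 0.615, 0.0511),
    (3, 0.469, 0.573, 0.677, 0.0585),
]


def interval(y0_hat, se, t=T_QUANTILE):
    """90% interval for the mean response: y0_hat plus or minus t * se."""
    half_width = t * se
    return y0_hat - half_width, y0_hat + half_width


for level, lb, y0_hat, ub, se in TABLE_ROWS:
    lo, hi = interval(y0_hat, se)
    print(level, round(lo, 3), round(hi, 3), "table:", lb, ub)
```

The recomputed bounds match the table to within rounding, and successive predictions differ by .049, the SRB coefficient.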


We note also that the widths of the 90% confidence intervals defined above can
be approximately represented as $\hat{Y}_0 \pm 10\%$. When MOS for which the overall
model provided a better fit were considered (such as MOS 63B), these confidence
intervals were more nearly approximated by $\hat{Y}_0 \pm 3\%$.

It is not a simple matter to measure 6 dimensional data spaces. For our uses,
however, it is a simple enough matter to ensure that any sensitivity analysis conducted
with respect to any particular independent variable, or combination of independent
variables, remains in the range of values defined by the sample data space for those
variables. In general, when the independent variables are unrelated, the bulk of the
potential problems associated with prediction are avoided if the sensitivity analysis is
conducted within the individual value ranges of the independent variables.
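A simple range check of this kind might look as follows. This is a minimal sketch: the function names and the three-variable sample are hypothetical, and staying inside each individual range is only a safeguard when the independent variables are unrelated, as the paragraph above notes.

```python
def sample_ranges(rows):
    """Return the (min, max) observed for each column of the sample rows."""
    return [(min(col), max(col)) for col in zip(*rows)]


def out_of_range(x0, ranges, names):
    """Return the names of variables whose hypothesized value leaves its range."""
    return [
        name
        for value, (low, high), name in zip(x0, ranges, names)
        if value < low or value > high
    ]


# Hypothetical sample with three carrier variables per observation.
names = ("SRB", "RACE", "DEP")
sample = [(0, 0.32, 0.20), (1, 0.45, 0.26), (0, 0.36, 0.16)]
ranges = sample_ranges(sample)

print(out_of_range((2, 0.40, 0.25), ranges, names))  # SRB level 2 was never observed
print(out_of_range((1, 0.40, 0.25), ranges, names))  # all values inside their ranges
```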

We must be particularly careful when the estimated coefficient of any carrier
variable, such as SRB, is interpreted as the effect of varying the level of the associated
variable while the other values are unchanged. Even when that variable is unrelated to
the other carrier variables, the range of values for which such an interpretation is valid,
as described by the sample data space, should be respected. This is best shown by example.

At Appendix E, we have examined the model parameter estimates for MOS 63B.
We note that the coefficient for carrier variable SRB is estimated as .304. This
estimate is based on a sample data range of (0, 1) for the variable SRB. Clearly, it is
not reasonable to use this estimate as an effectiveness coefficient at SRB levels 2, 3 or
higher (implying a 60%, 90%, or higher increase in retention rate over the SRB level 0
rate). Alternatives for prediction and comparison when we wish to extrapolate beyond
our data space are described in the next section.

G. ALTERNATE MODELLING STRATEGIES

In developing the overall model, we considered a data base representing 24 high
density MOS, which were authorized as of 30 September 1985, and for which an active
SRB history existed during our period of analysis. We then fit the proposed overall
model to 50 moderate density MOS with active SRB histories to verify our modelling
assumptions. Based on our preceding analysis, we propose that the overall model be
extended for general use in explaining the variation in zone A retention behavior for all
MOS. We acknowledge, however, that as the density associated with an MOS
decreases, so does our ability to maintain small confidence intervals about our
parameter and prediction estimates. As stated earlier, this is a consequence of
including the additional imprecision associated with our measurement of Y in the error
term ε. If, however, in our examination of residuals we find no reason to discount our
modelling assumptions, and no intuitive reason exists to discount these assumptions,
then there is no reason to believe a better model exists.

In the event that the model suffers grossly from lack of fit, or other factors exist
which cast doubt on the applicability of the model to a particular MOS, use of this
model in a predictive procedure for that MOS is not advised. This situation is most
likely to occur in fitting the model to data associated with very low density, highly
technical MOS. In such a case, it is advisable to construct and maintain an MOS
specific predictive model. Any inter-MOS comparison of the estimated coefficients of
like carrier variables should not include these unique specialties.

Suggestions for using the developed overall model under extraordinary
circumstances follow.

1.  Modelling  a  new  MOS 

Typically, when a new MOS is introduced, personnel are reclassified from
some other specialty, which is in turn reduced in size or eliminated. A pseudo-historic
data base for the new MOS can then be created by including the records of the
individual reenlistment decisions and SRB histories applicable to soldiers in the losing
MOS.

2.  Modelling  a  Low  Density  MOS 

When the sample sizes involved in a very low density MOS are so small that
acceptably reliable estimates of the regression parameters cannot be attained, but the
model is believed adequate, then it is recommended that the estimated coefficients of a
like MOS, for which an adequate sample size is available, be used in retention rate
prediction. This alternative is suggested in preference to grouping these low density
MOS for two reasons. First, an explicit decision is made by the SRB program
manager as to which MOS can best represent the MOS of concern in retention rate
projection. With the group method, we average the effects of several MOS. It is
intuitive that our results with a single most similar MOS should be better. Second, we
need not develop imaginative ways to group MOS unique factors, such as SRB level,
across many MOS.

3. Extrapolating Beyond the Sample Data Space

If the extrapolation is not too distant from the sample data space and does
not involve extrapolating the SRB level, then it is recommended that we use the
developed model without modification, making clear our concern over the increasing
danger of using a point prediction. If the extrapolation does involve the SRB variable,
or the extrapolation is far beyond the data space described by our available data in any
dimension, then selection of a like MOS, with a data space accommodating our needs,
is recommended for use in analysis.


V.  CONCLUSIONS  AND  RECOMMENDATIONS 


In this thesis, the problem of developing a predictive model which explains the
variation in zone A enlisted retention rates at the MOS level is formulated and solved
using stepwise and ordinary least squares linear regression analysis. Inasmuch as the
principal use of this model will be in the management of the SRB program, SRB level
was initially included as a candidate carrier variable. Two other categories of candidate
carrier variables were also included. The endogenous variables represent a demographic
profile of an eligible reenlistment population. The exogenous variables represent the
alternate career opportunities as perceived by the reenlistment decision-maker. This
approach represents a significant improvement over earlier efforts to solve this
problem, in that a capability to include a demographic profile of the eligible
populations was not previously available to the analyst.

To allow for the inter-MOS comparison of the estimated regression coefficients
associated with the SRB variable, an overall projection model, applicable to all MOS,
was developed. We selected 24 high density MOS, which had active SRB histories in
our sample period, to include in our initial analysis. Stepwise multiple linear regression
analysis was used to find a best overall explanatory model, which could be used to
project retention at the MOS level. The proposed overall model follows:

$$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_2^2 + \beta_4 X_3 + \beta_5 X_4 + \beta_6 X_5 + \varepsilon \qquad (5.1)$$

where:

$Y$ = retention rate
$X_1$ = SRB level
$X_2$ = fiscal quarter
$X_3$ = rate representing the race profile of an eligible population
$X_4$ = rate representing the dependent profile of an eligible population
$X_5$ = rate representing the real change in a soldier's pay through time
$\varepsilon$ = error component with assumed distribution $N(0, \sigma^2)$

and $\beta$ is a vector of the parameters to be estimated.
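For readers who want to see how the fitted model is evaluated, the sketch below computes a predicted retention rate from a coefficient vector. The function name and all coefficient values except the SRB coefficient of .049 (reported earlier for MOS 1111) are hypothetical placeholders, not estimates from this study.

```python
def predicted_retention(beta, srb, qtr, race, dep, real):
    """Evaluate the overall model at the given carrier variable values.

    beta is (b0, b1, ..., b6): intercept, SRB level, fiscal quarter,
    fiscal quarter squared, race rate, dependent rate, real pay change.
    """
    b0, b1, b2, b3, b4, b5, b6 = beta
    return (b0 + b1 * srb + b2 * qtr + b3 * qtr ** 2
            + b4 * race + b5 * dep + b6 * real)


# Hypothetical coefficients; only the SRB value (.049) comes from the text.
beta = (0.10, 0.049, 0.02, -0.005, 0.50, 0.30, 0.01)

base = predicted_retention(beta, srb=0, qtr=2, race=0.4, dep=0.25, real=3.0)
with_bonus = predicted_retention(beta, srb=1, qtr=2, race=0.4, dep=0.25, real=3.0)
print(round(with_bonus - base, 3))
```

Holding the other variables fixed, a one-level change in SRB shifts the prediction by exactly the SRB coefficient, which is the interpretation whose valid range the caveats below restrict.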


We note that the $X_2$ variable is included to account for the effects of the
observed seasonal behavior in the retention rate. We note also that no variable is
included in the proposed overall model which accounts for the effects of promotion
program management.

Personnel inventory managers at MILPERCEN view the Army promotion
program as a force alignment tool in the same way that accession and reclassification
programs are viewed. Promotion opportunities to grades E5 and E6 are managed at the
MOS level with the intention of providing incentives (or disincentives) for zone A
soldiers to reenlist for their entry MOS. In not including an independent variable in
our proposed overall model to account for this mechanism, we make no conclusions as
to its effectiveness, but we do conclude that the statistic provided us to measure its
effect is inadequate for that purpose. The measure preferred by the MILPERCEN
program managers, promotion cut-off score, was unavailable during the period of our
analysis. We recommend that an analysis similar to the one described in this report,
including the promotion cut-off scores, be performed when a sufficient base of historic
records is available.

We selected 50 moderate density MOS, which had active SRB histories in our
sample period, to include in our validation analysis. Our results from this analysis were
very favorable. We recommend the proposed model for use in predicting retention for
all MOS with the following caveats:

1. Care must be taken in extrapolating beyond the region defined by our sample
space.

2. Reliance on point estimates for retention becomes increasingly dangerous as the
density of the MOS decreases.

3. When the estimated regression coefficient of the SRB variable is interpreted as
the effect of varying the SRB level while the level of all other factors remains
unchanged, the range of values for which that interpretation is valid must be
respected.

When the regression coefficients cannot be reliably estimated from the available
data, we recommend the use of the estimated regression coefficients of a like and more
reliably modelled MOS. We prefer this alternative to the method of creating MOS
groups, as has been done in past studies, for two reasons. First, a decision is explicitly
made by the SRB program manager as to which particular MOS model can best
represent the MOS of concern. Second, the problems associated with grouping MOS
unique data, such as SRB level, are avoided.


The actual estimated regression coefficients developed for each MOS in our
analysis have not been included in this report. Instead, this analysis has been
conducted using only those programming languages and analytic software available to
the DCSPLANS, MILPERCEN, Force Plans Branch. All program code required to
implement the analytic processes described in this report is included as appendices
and referenced as appropriate.

It is recommended that the regression coefficients be estimated on a periodic
basis using the programs and procedures described in Chapter IV of this report. If it
becomes apparent that the overall model is no longer adequate, either through
examination of the residuals or because some measurable factor not included in the
overall model has become critical to the reenlistment decision (as could occur with a
change to the EPMS), then we recommend that a zone A retention model be newly
developed following the procedures set forth in Chapter III of this study.




APPENDIX  A 

FORTRAN PROGRAM TO PRODUCE DEMOGRAPHIC RATES

C************* VARIABLE DECLARATIONS ***********************************
C
      REAL REUPR(5,250,20),RACER(5,250,20),DEPR(5,250,20),
     1SEXR(5,250,20),CIVEDR(5,250,20),AFQTR(5,250,20),TERMR(5,250,20)
      CHARACTER*1 PMOS*3,RACE,MARST,SEX,CIVED,TGTMOS(250)*3
C     COUNTERS AND RECORD FIELDS ARE INTEGERS.  THESE DECLARATIONS ARE
C     NOT LEGIBLE IN THE SOURCE LISTING AND ARE INFERRED FROM USE.
      INTEGER RECTOT(5,250,20),REUPY(5,250,20),RACEY(5,250,20),
     1DEPY(5,250,20),SEXY(5,250,20),CIVEDY(5,250,20),AFQTY(5,250,20),
     2TERMY(5,250,20),OTHER(5,20),REUPO(5,20)
      INTEGER TOTREC,TOTMOS,A,B,C,O,P,Z,TIS,QTR,REUP,LEVEL,TERM,
     1BASDY,BASDM,EDATEY,EDATEM,DEP,AFQT
C
      TOTREC = 0
C*************** MOS TARGETS *******************************************
C     READ UP TO 250 TARGET MOS CODES FROM UNIT 5
      DO 5 I = 1,250
         READ(5,101,END=9) TGTMOS(I)
    5 CONTINUE
  101 FORMAT(A3)
    9 TOTMOS = I-1
C**************** INITIALIZATION ***************************************
C     INDICES ARE (TIS ZONE, TARGET MOS, QUARTER)
      DO 10 A=1,5
      DO 11 B=1,250
      DO 12 C=1,20
         RECTOT(A,B,C)=0
         RACEY(A,B,C)=0
         DEPY(A,B,C)=0
         SEXY(A,B,C)=0
         CIVEDY(A,B,C)=0
         AFQTY(A,B,C)=0
         REUPY(A,B,C)=0
         TERMY(A,B,C)=0
         OTHER(A,C)=0
         REUPO(A,C)=0
         RACER(A,B,C)=0.0
         DEPR(A,B,C)=0.0
         SEXR(A,B,C)=0.0
         CIVEDR(A,B,C)=0.0
         AFQTR(A,B,C)=0.0
         REUPR(A,B,C)=0.0
         TERMR(A,B,C)=0.0
   12 CONTINUE
   11 CONTINUE
   10 CONTINUE
C******** READ EACH RECORD (APPROX 431K) *******************************
   15 READ(11,102,END=19) PMOS,REUP,LEVEL,TERM,BASDY,BASDM,EDATEY,
     1 EDATEM,RACE,MARST,DEP,SEX,CIVED,AFQT
  102 FORMAT(A3,3I1,4I2,2A1,I1,2A1,I2)
C
      TOTREC = TOTREC + 1
C***************** ESTABLISH TIS ***************************************
C     TIME IN SERVICE IN MONTHS, THEN TIS ZONE Z.  THE ASSIGNMENTS
C     Z=3 AND Z=4 ARE MISSING FROM THE SOURCE LISTING AND ARE INFERRED.
      TIS = (EDATEY*12 + EDATEM) - (BASDY*12 + BASDM)
C
      IF(TIS.LT.21) THEN
         Z=1
      ELSE IF(TIS.LT.72) THEN
         Z=2
      ELSE IF(TIS.LT.120) THEN
         Z=3
      ELSE IF(TIS.LT.168) THEN
         Z=4
      ELSE
         Z=5
      END IF
C**************** ESTABLISH QUARTER ************************************
      QTR = (((EDATEY*12 + EDATEM) - 970) / 3) + 1
C
      IF(QTR.LT.1 .OR. QTR.GT.20) GO TO 15
C******************* START COUNT ***************************************
C
      DO 20 J=1,TOTMOS
         IF(PMOS.NE.TGTMOS(J)) THEN
            GO TO 20
         ELSE
            RECTOT(Z,J,QTR) = RECTOT(Z,J,QTR) + 1
            TERMY(Z,J,QTR) = TERMY(Z,J,QTR) + TERM
C
            IF(RACE.NE.'C') THEN
               RACEY(Z,J,QTR) = RACEY(Z,J,QTR) + 1
C  OTHER CODES: M(YELLOW),N,R(AMER IND),X,Z(UNK).
            ENDIF
C
            IF(DEP.GE.2) THEN
               DEPY(Z,J,QTR) = DEPY(Z,J,QTR) + 1
            END IF
C
            IF(SEX.EQ.'F') THEN
               SEXY(Z,J,QTR) = SEXY(Z,J,QTR) + 1
            ENDIF
C
            IF(CIVED.GT.'D') THEN
               CIVEDY(Z,J,QTR) = CIVEDY(Z,J,QTR) + 1
C  OTHER CODES: 0,1,2,3,4,5,6,7,8,A,B,C,D,//E...W,Y(NO Z).
            ENDIF
C
            IF(AFQT.LT.50) THEN
               AFQTY(Z,J,QTR) = AFQTY(Z,J,QTR) + 1
C  BRKPTS: 4(16-30),3B(31-49),3A(50-64),2(65-92),1(93-99)
            ENDIF
C
            IF(REUP.EQ.1) THEN
               REUPY(Z,J,QTR) = REUPY(Z,J,QTR) + 1
            ENDIF
            GO TO 15
         ENDIF
   20 CONTINUE
C     RECORD DID NOT MATCH ANY TARGET MOS
      IF(QTR.LT.1 .OR. QTR.GT.20) GO TO 15
      OTHER(Z,QTR) = OTHER(Z,QTR) + 1
      IF(REUP.EQ.1) THEN
         REUPO(Z,QTR) = REUPO(Z,QTR) + 1
      ENDIF
      GO TO 15
C************* DIVIDE TO GET RATES *************************************
C     THE DENOMINATORS ARE TRUNCATED IN THE SOURCE LISTING.  RECTOT IS
C     INFERRED FOR ALL RATES EXCEPT TERMR, WHICH DIVIDES BY REUPY.
   19 DO 30 L = 1,5
      DO 40 M = 1,TOTMOS
      DO 50 N = 1,20
         IF(RECTOT(L,M,N).LT.1) GO TO 50
         RACER(L,M,N) = FLOAT(RACEY(L,M,N))/FLOAT(RECTOT(L,M,N))
         DEPR(L,M,N) = FLOAT(DEPY(L,M,N))/FLOAT(RECTOT(L,M,N))
         SEXR(L,M,N) = FLOAT(SEXY(L,M,N))/FLOAT(RECTOT(L,M,N))
         CIVEDR(L,M,N) = FLOAT(CIVEDY(L,M,N))/FLOAT(RECTOT(L,M,N))
         AFQTR(L,M,N) = FLOAT(AFQTY(L,M,N))/FLOAT(RECTOT(L,M,N))
         REUPR(L,M,N) = FLOAT(REUPY(L,M,N))/FLOAT(RECTOT(L,M,N))
C        GUARD AGAINST DIVISION BY ZERO IN THE AVERAGE TERM
         IF(REUPY(L,M,N).LT.1) REUPY(L,M,N) = 100000
         TERMR(L,M,N) = FLOAT(TERMY(L,M,N))/FLOAT(REUPY(L,M,N))
         IF(REUPY(L,M,N).EQ.100000) REUPY(L,M,N) = 0
   50 CONTINUE
   40 CONTINUE
   30 CONTINUE
C****************** OUTPUT *********************************************
      DO 60 O=1,TOTMOS
         WRITE(13,514) (REUPY(2,O,P),P=1,20)
   60 CONTINUE
      DO 61 O=1,TOTMOS
         WRITE(13,513) (REUPR(3,O,P),P=1,20)
   61 CONTINUE
C
      DO 62 O=1,TOTMOS
         WRITE(13,514) (RACEY(2,O,P),P=1,20)
   62 CONTINUE
      DO 63 O=1,TOTMOS
         WRITE(13,513) (RACER(3,O,P),P=1,20)
   63 CONTINUE
C
      DO 64 O=1,TOTMOS
         WRITE(13,514) (DEPY(2,O,P),P=1,20)
   64 CONTINUE
      DO 65 O=1,TOTMOS
         WRITE(13,513) (DEPR(3,O,P),P=1,20)
   65 CONTINUE
C
      DO 66 O=1,TOTMOS
         WRITE(13,514) (SEXY(2,O,P),P=1,20)
   66 CONTINUE
      DO 67 O=1,TOTMOS
         WRITE(13,513) (SEXR(3,O,P),P=1,20)
   67 CONTINUE
C
      DO 68 O=1,TOTMOS
         WRITE(13,514) (CIVEDY(2,O,P),P=1,20)
   68 CONTINUE
      DO 69 O=1,TOTMOS
         WRITE(13,513) (CIVEDR(3,O,P),P=1,20)
   69 CONTINUE
C
      DO 70 O=1,TOTMOS
         WRITE(13,514) (AFQTY(2,O,P),P=1,20)
   70 CONTINUE
      DO 71 O=1,TOTMOS
         WRITE(13,513) (AFQTR(3,O,P),P=1,20)
   71 CONTINUE
C
      DO 72 O=1,TOTMOS
         WRITE(13,514) (TERMY(2,O,P),P=1,20)
   72 CONTINUE
      DO 73 O=1,TOTMOS
         WRITE(13,513) (TERMR(3,O,P),P=1,20)
   73 CONTINUE
C
      DO 74 O=1,TOTMOS
         WRITE(13,514) (RECTOT(2,O,P),P=1,20)
   74 CONTINUE
      DO 75 O=1,TOTMOS
         WRITE(13,514) (RECTOT(3,O,P),P=1,20)
   75 CONTINUE
C
      DO 76 O=1,5
         WRITE(13,514) (OTHER(O,P),P=1,20)
   76 CONTINUE
C
      DO 77 O=1,5
         WRITE(13,514) (REUPO(O,P),P=1,20)
   77 CONTINUE
C******************* FORMATS *******************************************
C
  513 FORMAT(20(F5.3,1X))
  514 FORMAT(20(I5,1X))
      STOP
      END
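The binning logic of the listing above, which assigns each record a time-in-service zone and a fiscal quarter index, can be sketched compactly as follows. The function names are hypothetical, and the zone boundaries for zones 3 and 4 are inferred from the ELSE IF chain because the assignments themselves are not legible in the listing.

```python
def tis_zone(edate_y, edate_m, basd_y, basd_m):
    """Return the zone index (1 to 5) from time in service in months."""
    tis = (edate_y * 12 + edate_m) - (basd_y * 12 + basd_m)
    for zone, upper in enumerate((21, 72, 120, 168), start=1):
        if tis < upper:
            return zone
    return 5


def quarter_index(edate_y, edate_m):
    """Return the quarter index, with October of year 80 starting quarter 1."""
    months = (edate_y * 12 + edate_m) - 970
    return int(months / 3) + 1  # truncates toward zero, as FORTRAN division does


print(tis_zone(84, 10, 80, 10))   # 48 months of service
print(quarter_index(80, 10), quarter_index(85, 9))
```

Records whose quarter index falls outside 1 to 20 are skipped by the program, so only twenty quarters enter the counts.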


APPENDIX  B 
CORRELATION  MATRIX 


              REUP        SRB       RACE
REUP        1.00000    0.40106    0.34487
SRB         0.40106    1.00000   -0.16719
RACE        0.34487   -0.16719    1.00000
DEP         0.24321    0.07366   -0.10544
SEX         0.23577   -0.19467    0.52904
EDUCATE    -0.16155   -0.00109    0.00509
AFQT        0.19732   -0.11239    0.55851
E5TEST2     0.01142    0.35958   -0.21260
E6TEST2    -0.08781    0.30615   -0.29116
QTR        -0.32714   -0.06137   -0.01095
UNEMPLY     0.17660   -0.21680    0.12927
REAL        0.25425   -0.10198    0.14387

               DEP        SEX     EDUCATE     AFQT
REUP        0.24321    0.23577   -0.16155    0.19732
SRB         0.07366   -0.19467   -0.00109   -0.11239
RACE       -0.10544    0.52904    0.00509    0.55851
DEP         1.00000   -0.19403    0.01708    0.01983
SEX        -0.19403    1.00000    0.25536    0.00782
EDUCATE     0.01708    0.25536    1.00000   -0.33314
AFQT        0.01983    0.00782   -0.33314    1.00000
E5TEST2    -0.06718   -0.27534   -0.12471   -0.16466
E6TEST2     0.12697   -0.51307   -0.11160   -0.13838
QTR        -0.29468   -0.02192    0.21847    0.00070
UNEMPLY    -0.09619    0.07544   -0.61271    0.17733
REAL       -0.10925    0.05044   -0.24003    0.06349

            E5TEST2    E6TEST2      QTR
REUP        0.01142   -0.08781   -0.32714
SRB         0.35958    0.30615   -0.06137
RACE       -0.21260   -0.29116   -0.01095
DEP        -0.06718    0.12697   -0.29468
SEX        -0.27534   -0.51307   -0.02192
EDUCATE    -0.12471   -0.11160    0.21847
AFQT       -0.16466   -0.13838    0.00070
E5TEST2     1.00000    0.12996   -0.00004
E6TEST2     0.12996    1.00000    0.00005
QTR        -0.00004    0.00005    1.00000
UNEMPLY    -0.00011   -0.00007   -0.02815
REAL       -0.00010   -0.00008    0.00000

            UNEMPLY      REAL
REUP        0.17660    0.25425
SRB        -0.21680   -0.10198
RACE        0.12927    0.14387
DEP        -0.09619   -0.10925
SEX         0.07544    0.05044
EDUCATE    -0.61271   -0.24003
AFQT        0.17733    0.06349
E5TEST2    -0.00011   -0.00010
E6TEST2    -0.00007   -0.00008
QTR        -0.02815    0.00000
UNEMPLY     1.00000    0.62229
REAL        0.62229    1.00000

APPENDIX  C 

SAMPLE INPUT FILE - SAS PROC STEPWISE

MOS=63B

OBS      REUP     SRB      RACE        DEP
  1    0.565200    1     0.315900    0.204300
  2    0.464800    0     0.344600    0.207600
  3    0.356500    0     0.325200    0.161900
  4    0.286700    0     0.359400    0.165000
  5    0.382400    0     0.393200    0.199100
  6    0.680100    0     0.454000    0.211400
  7    0.483900    0     0.388200    0.233400
  8    0.403300    0     0.418300    0.201600
  9    0.532600    0     0.435400    0.245700
 10    0.554500    0     0.397200    0.260700
 11    0.416500    0     0.399600    0.210400
 12    0.293900    0     0.385900    0.175800
 13    0.422500    0     0.399200    0.246100
 14    0.523500    0     0.395500    0.222200
 15    0.394400    0     0.380500    0.228600
 16    0.270500    0     0.384100    0.209100
 17    0.345300    0     0.351300    0.225500
 18    0.312400    0     0.325300    0.249500
 19    0.367100    0     0.384900    0.170600
 20    0.247700    0     0.361700    0.165700

OBS       SEX       EDUCATE       AFQT      E5TEST2
  1    0.0377000    0.782600    0.636200     0.1790
  2    0.0392000    0.797700    0.674900     0.2210
  3    0.0435000    0.789100    0.650300     0.1120
  4    0.0350000    0.888100    0.664300    -0.0710
  5    0.0407000    0.334800    0.653800    -0.1250
  6    0.0368000    0.829000    0.704000     0.2080
  7    0.0491000    0.909100    0.653600    -0.3500
  8    0.0354000    0.862400    0.666200    -0.2920
  9    0.0430000    0.773200    0.721500    -0.2170
 10    0.0248000    0.773800    0.732400    -0.3000
 11    0.0381000    0.816100    0.683900    -0.4960
 12    0.0505000    0.851500    0.643400    -0.5790
 13    0.0988000    0.829500    0.676400    -0.6040
 14    0.0885000    0.811700    0.655400    -0.5040
 15    0.0628000    0.844700    0.713800    -0.3960
 16    0.0420000    0.902300    0.622700    -0.2330
 17    0.0659000    0.924200    0.592300    -0.3830
 18    0.0813000    0.883500    0.659900    -2.6380
 19    0.0496000    0.932500    0.730200    -1.3460
 20    0.0395000    0.952900    0.597300    -1.2960

OBS    E6TEST2    UNEMPLY      REAL
  1    -1.1250     7.4000     0.80000
  2    -1.5960     7.4000     0.80000
  3    -1.5290     7.4000     0.80000
  4    -1.2330     8.2000     0.80000
  5    -1.0130     8.8000     9.30000
  6    -0.9500     9.5000     9.30000
  7    -1.0830     9.9000     9.30000
  8    -1.2880    10.6000     9.30000
  9    -1.4250    10.4000     1.10000
 10    -1.4250    10.1000     1.10000
 11    -1.3670     9.3000     1.10000
 12    -0.6080     8.5000     1.10000
 13    -0.7540     7.9000    -0.20000
 14    -0.6380     7.5000    -0.20000
 15    -0.6290     7.4000    -0.20000
 16    -0.7420     7.2000    -0.20000
 17     0.1500     7.3000     0.80000
 18    -0.4580     7.3000     0.80000
 19    -0.3790     7.2000     0.80000
 20    -0.3830     7.0000     0.80000

APPENDIX  D 

SAMPLE  OUTPUT  FILE  -  SAS  PROC  STEPWISE 

MOS=63B 

STEPWISE  REGRESSION  PROCEDURE  FOR  DEPENDENT  VARIABLE  REUP 


STEP 1    VARIABLE EDUCATE ENTERED      R SQUARE = 0.40205418
                                        C(P) = 45.63109799

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     1       0.09995765    0.09995765    12.10    0.0027
ERROR         18       0.14865970    0.00825887
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    1.52788549
EDUCATE     -1.30962992    0.37644450    0.09995765    12.10    0.0027

BOUNDS ON CONDITION NUMBER:  1,  2

STEP 2    VARIABLE REAL ENTERED         R SQUARE = 0.54491298
                                        C(P) = 32.90644519

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     2       0.13547482    0.06773741    10.18    0.0012
ERROR         17       0.11314253    0.00665544
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    1.54759367
EDUCATE     -1.36639281    0.33882394    0.10823804    16.26    0.0009
REAL         0.01207975    0.00522910    0.03551718     5.34    0.0337

BOUNDS ON CONDITION NUMBER:  1.005287,  8.042296

STEP 3    VARIABLE QTRSQ ENTERED        R SQUARE = 0.64745038
                                        C(P) = 24.33777424

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     3       0.16096740    0.05365580     9.79    0.0007
ERROR         16       0.08764995    0.00547812
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    1.27877948
EDUCATE     -0.98469643    0.35468525    0.04222311     7.71    0.0135
QTRSQ       -0.00725385    0.00336262    0.02549258     4.65    0.0465
REAL         0.01165255    0.00474824    0.03299198     6.02    0.0260

BOUNDS ON CONDITION NUMBER:  1.338361,  22.06034

STEP 4    VARIABLE RACE ENTERED         R SQUARE = 0.69900259
                                        C(P) = 21.02421694

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     4       0.17378417    0.04344604     8.71    0.0008
ERROR         15       0.07483318    0.00498888
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    0.88054612
RACE         0.83180763    0.51896133    0.01281677     2.57    0.1293
EDUCATE     -0.87244692    0.34564563    0.03178474     6.37    0.0234
QTRSQ       -0.00776269    0.00322462    0.02891151     5.80    0.0294
REAL         0.00758029    0.00519492    0.01062225     2.13    0.1651

BOUNDS ON CONDITION NUMBER:  1.395655,  43.2307

STEP 5    VARIABLE REAL REMOVED         R SQUARE = 0.65627723
                                        C(P) = 23.42797351

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     3       0.16316192    0.05438731    10.18    0.0005
ERROR         16       0.08545543    0.00534096
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    0.68753
RACE         1.20215572    0.46836293    0.03518650     6.59    0.0207
EDUCATE     -0.78645354    0.35239792    0.02660107     4.98    0.0403
QTRSQ       -0.00815953    0.00332457    0.03217236     6.02    0.0259

BOUNDS ON CONDITION NUMBER:  1.355083,  22.25706

STEP 6    VARIABLE SRB ENTERED          R SQUARE = 0.74588024
                                        C(P) = 16.19247340

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     4       0.18543877    0.04635969    11.01    0.0002
ERROR         15       0.06317858    0.00421191
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT    0.30239640
SRB          0.18328386    0.07969602    0.02227685     5.29    0.0352
RACE         1.71637743    0.47221401    0.05564504    13.21    0.0024
EDUCATE     -0.58192906    0.32533230    0.01347610     3.20    0.0939
QTRSQ       -0.00726620    0.00297778    0.02507889     5.95    0.0276

BOUNDS ON CONDITION NUMBER:  1.464518,  44.55447


STEP 7    VARIABLE QTR ENTERED          R SQUARE = 0.88534414
                                        C(P) = 3.81773703

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     5       0.22011191    0.04402238    21.62    0.0001
ERROR         14       0.02850544    0.00203610
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT   -0.27080045
SRB          0.27969908    0.06013548    0.04404739    21.63    0.0004
RACE         2.04097645    0.33761270    0.07441116    36.55    0.0001
EDUCATE     -0.34220227    0.23353805    0.00437170     2.15    0.1649
QTR          0.23168421    0.05614352    0.03467315    17.03    0.0010
QTRSQ       -0.05231981    0.01111232    0.04513596    22.17    0.0003

BOUNDS ON CONDITION NUMBER:  39.11734,  824.5

STEP 8    VARIABLE EDUCATE REMOVED      R SQUARE = 0.86776010
                                        C(P) = 3.63014885

              DF   SUM OF SQUARES   MEAN SQUARE        F    PROB>F
REGRESSION     4       0.21574022    0.05393505    24.61    0.0001
ERROR         15       0.03287713    0.00219181
TOTAL         19       0.24861735

                B VALUE     STD ERROR    TYPE II SS        F    PROB>F
INTERCEPT   -0.62921701
SRB          0.30971234    0.05866171    0.06109564    27.87    0.0001
RACE         2.18472       0.33517003    0.09312487    42.49    0.0001
QTR          0.25214804    0.05641976    0.04377755    19.97    0.0005
QTRSQ       -0.05759786    0.01090687    0.06112444    27.89    0.0001

BOUNDS ON CONDITION NUMBER:  36.30779,  592.6314

STEP 9   VARIABLE E5TEST2 ENTERED    R SQUARE = 0.91278288
                                     C(P) = 0.98958872

            DF   SUM OF SQUARES   MEAN SQUARE        F   PROB>F
REGRESSION   5       0.22693366    0.04538673    29.30   0.0001
ERROR       14       0.02168369    0.00154883
TOTAL       19       0.24861735

                B VALUE    STD ERROR   TYPE II SS        F   PROB>F
INTERCEPT   -0.51081673
SRB          0.26583755   0.05194297   0.04056805    26.19   0.0002
RACE         1.90943944   0.29978309   0.06283515    40.57   0.0001
E5TEST2      0.03974190   0.01478323   0.01119344     7.23   0.0177
QTR          0.25877750   0.04749181   0.04598548    29.69   0.0001
QTRSQ       -0.05890404   0.00918143   0.06374917    41.16   0.0001

BOUNDS ON CONDITION NUMBER: 36.40594, 758.1488


STEP 10  VARIABLE UNEMPLY ENTERED    R SQUARE = 0.92644837
                                     C(P) = 1.58106751

            DF   SUM OF SQUARES   MEAN SQUARE        F   PROB>F
REGRESSION   6       0.23033114    0.03838852    27.29   0.0001
ERROR       13       0.01828621    0.00140663
TOTAL       19       0.24861735

                B VALUE    STD ERROR   TYPE II SS        F   PROB>F
INTERCEPT   -0.49003893
SRB          0.25534625   0.04995923   0.03674582    26.12   0.0002
RACE         1.52242447   0.37898701   0.02269882    16.14   0.0015
E5TEST2      0.03637129   0.01425421   0.00915823     6.51   0.0241
QTR          0.25377900   0.04537328   0.04400392    31.28   0.0001
QTRSQ       -0.05800326   0.00876897   0.06154427    43.75   0.0001
UNEMPLY      0.01577333   0.01014928   0.00339748     2.42   0.1442

BOUNDS ON CONDITION NUMBER: 36.58979, 952.6237


NO  OTHER  VARIABLES  MET  THE  0.1500  SIGNIFICANCE  LEVEL  FOR  ENTRY 
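The quantities printed at each step are tied together by a few identities, which allows the listing to be cross-checked by hand. The sketch below is written in Python rather than SAS, with variable and function names of our own choosing; it recomputes R SQUARE, the overall F, and the partial F for UNEMPLY from the step 10 sums of squares. It is an illustration of the arithmetic only, not part of the original program.

```python
# Cross-check of the step 10 ANOVA table (values copied from the listing).
ss_regression = 0.23033114
ss_error = 0.01828621
df_regression = 6
df_error = 13

ss_total = ss_regression + ss_error          # TOTAL sum of squares
r_square = ss_regression / ss_total          # R SQUARE
ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error               # error mean square
f_model = ms_regression / ms_error           # overall F


def partial_f(type_ii_ss, ms_error):
    """Partial F for one variable: its Type II SS over the error mean square."""
    return type_ii_ss / ms_error


f_unemply = partial_f(0.00339748, ms_error)  # UNEMPLY row of step 10

print(round(r_square, 4), round(f_model, 2), round(f_unemply, 2))
```

The same check applied to any other step should reproduce the printed F values to the two decimals shown.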


SUMMARY OF STEPWISE REGRESSION PROCEDURE FOR DEPENDENT VARIABLE REUP

             VARIABLE        NUMBER   PARTIAL     MODEL
STEP    ENTERED    REMOVED       IN      R**2      R**2       C(P)
   1    EDUCATE                   1    0.4021    0.4021    45.6311
   2    REAL                      2    0.1429    0.5449    32.9064
   3    QTRSQ                     3    0.1025    0.6475    24.3378
   4    RACE                      4    0.0516    0.6990    21.0242
   5               REAL           3    0.0427    0.6563    23.4280
   6    SRB                       4    0.0896    0.7459    16.1925
   7    QTR                       5    0.1395    0.8853     3.8177
   8               EDUCATE        4    0.0176    0.8678     3.6301
   9    E5TEST2                   5    0.0450    0.9128     0.9896
  10    UNEMPLY                   6    0.0137    0.9264     1.5811

             VARIABLE
STEP    ENTERED    REMOVED          F    PROB>F
   1    EDUCATE               12.1031    0.0027
   2    REAL                   5.3366    0.0337
   3    QTRSQ                  4.6535    0.0465
   4    RACE                   2.5691    0.1298
   5               REAL        2.1292    0.1651
   6    SRB                    5.2890    0.0362
   7    QTR                   17.0292    0.0010
   8               EDUCATE     2.1471    0.1649
   9    E5TEST2                7.2270    0.0177
  10    UNEMPLY                2.4153    0.1442
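The PARTIAL R**2 column in the summary is the change in MODEL R**2 produced by each step, whether a variable enters or leaves. A minimal Python sketch of that relationship follows; the function name is ours, and agreement with the printed column is only expected to within rounding of the four-decimal values.

```python
# MODEL R**2 after each of the ten steps, copied from the summary table.
model_r2 = [0.4021, 0.5449, 0.6475, 0.6990, 0.6563,
            0.7459, 0.8853, 0.8678, 0.9128, 0.9264]


def partial_r2(model_r2):
    """Absolute change in model R**2 at each step (step 1 starts from zero)."""
    previous = [0.0] + model_r2[:-1]
    return [round(abs(cur - prev), 4) for cur, prev in zip(model_r2, previous)]


changes = partial_r2(model_r2)
print(changes)
```

Steps 5 and 8 are removals, so the model R**2 falls there; the summary reports the size of the drop as a positive partial R**2.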

APPENDIX  E 

SAMPLE OUTPUT FILE - SAS PROC REG


DEP VARIABLE: REUP

                       ANALYSIS OF VARIANCE

                    SUM OF          MEAN
SOURCE     DF      SQUARES        SQUARE    F VALUE    PROB>F
MODEL       6   0.21740661    0.03623444     15.092    0.0001
ERROR      13   0.03121074   0.002400826
C TOTAL    19   0.24861735

    ROOT MSE    0.04899822     R-SQUARE    0.8745
    DEP MEAN       0.41544     ADJ R-SQ    0.8165
    C.V.           11.7943

                       PARAMETER ESTIMATES

                  PARAMETER      STANDARD    T FOR H0:
VARIABLE   DF      ESTIMATE         ERROR  PARAMETER=0   PROB > |T|
INTERCEP    1   -0.59285005    0.18763783       -3.160       0.0075
QTR         1    0.24775924    0.05948661        4.165       0.0011
QTRSQ       1   -0.05638160    0.01178244       -4.785       0.0004
SRB         1    0.30434495    0.06217371        4.895       0.0003
RACE        1    2.00096212    0.41845916        4.782       0.0004
DEP         1    0.13621363    0.50887958        0.268       0.7932
REAL        1   0.002993859   0.003637893        0.823       0.4254

               PREDICT   STD ERR   LOWER95%   UPPER95%   LOWER95%
OBS   ACTUAL     VALUE   PREDICT       MEAN       MEAN    PREDICT
  1   0.5652    0.5652    0.0490     0.4593     0.6711     0.4155
  2   0.4648    0.3973    0.0215     0.3509     0.4433     0.2817
  3   0.3565    0.3182    0.0304     0.2526     0.3837     0.1936
  4   0.2867    0.2401    0.0237     0.1889     0.2913     0.1225
  5   0.3824    0.4503    0.0356     0.3734     0.5272     0.3194
  6   0.6801    0.6422    0.0339     0.5689     0.7155     0.5135
  7   0.4889    0.4794    0.0327     0.4087     0.5501     0.3521
  8   0.4033    0.3884    0.0322     0.3187     0.4580     0.2617
  9   0.5326    0.5085    0.0306     0.4424     0.5746     0.3837
 10   0.5545    0.5107    0.0246     0.4575     0.5639     0.3922
 11   0.4165    0.4745    0.0202     0.4309     0.5182     0.3600
 12   0.2939    0.2955    0.0225     0.2469     0.3441     0.1790
 13   0.4225    0.4302    0.0267     0.3726     0.4878     0.3097
 14   0.5235    0.4982    0.0212     0.4524     0.5440     0.3829
 15   0.3944    0.4349    0.0210     0.3895     0.4802     0.3197
 16   0.2705    0.2925    0.0265     0.2353     0.3497     0.1722
 17   0.3453    0.3346    0.0287     0.2726     0.3965     0.2119
 18   0.3124    0.3644    0.0325     0.2943     0.4346     0.2375
 19   0.3671    0.4388    0.0273     0.3798     0.4978     0.3176
 20   0.2477    0.2448    0.0234     0.1941     0.2954     0.1274

      UPPER95%               STD ERR    STUDENT   COOK'S
OBS    PREDICT    RESIDUAL  RESIDUAL   RESIDUAL        D
  1     0.7149     1.5E-16         0
  2     0.5130      0.0675    0.0440     1.5322    0.080
  3     0.4427      0.0383    0.0385     0.9969    0.083
  4     0.3577      0.0466    0.0429     1.0865    0.051
  5     0.5811     -0.0673    0.0337    -2.0154    0.650
  6     0.7710      0.0379    0.0354     1.0715    0.151
  7     0.6067    .0094981    0.0365     0.2605    0.008
  8     0.5151      0.0149    0.0369     0.4041    0.018
  9     0.6333      0.0241    0.0383     0.6294    0.036
 10     0.6292      0.0438    0.0424     1.0334    0.052
 11     0.5890     -0.0580    0.0446    -1.3000    0.049
 12     0.4120    -.001592    0.0435    -0.0366    0.000
 13     0.5507    -.007735    0.0411    -0.1882    0.002
 14     0.6135      0.0253    0.0442     0.5730    0.011
 15     0.5500     -0.0405    0.0443    -0.9145    0.027
 16     0.4129     -0.0220    0.0412    -0.5344    0.017
 17     0.4572      0.0107    0.0397     0.2699    0.005
 18     0.4914     -0.0520    0.0367    -1.4173    0.225
 19     0.5600     -0.0717    0.0407    -1.7622    0.200
 20     0.3621    .0029051    0.0430     0.0675    0.000

SUM OF RESIDUALS              2.49300E-16
SUM OF SQUARED RESIDUALS       0.03121074
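Each row of the observation table can be recomputed from three printed values (ACTUAL, PREDICT VALUE, STD ERR PREDICT) together with ROOT MSE. The Python sketch below does this for observation 9. The function name is ours, the t value 2.160 is an assumption (the tabled two-sided 95 percent point for 13 error degrees of freedom), and the formulas are the standard least squares ones rather than anything quoted from the SAS manual.

```python
import math

ROOT_MSE = 0.04899822   # ROOT MSE from the listing
N_PARAMS = 7            # intercept plus six regressors
T_CRIT = 2.160          # assumed: two-sided 95% t value, 13 error df


def row_diagnostics(actual, predict, se_predict):
    """Recompute one row of the observation table from three printed values."""
    mse = ROOT_MSE ** 2
    half_mean = T_CRIT * se_predict                        # CLM half-width
    half_pred = T_CRIT * math.sqrt(mse + se_predict ** 2)  # CLI half-width
    residual = actual - predict
    se_resid = math.sqrt(mse - se_predict ** 2)
    student = residual / se_resid
    cooks_d = student ** 2 / N_PARAMS * (se_predict / se_resid) ** 2
    return {
        "mean_ci": (predict - half_mean, predict + half_mean),
        "predict_ci": (predict - half_pred, predict + half_pred),
        "residual": residual,
        "se_resid": se_resid,
        "student": student,
        "cooks_d": cooks_d,
    }


obs9 = row_diagnostics(actual=0.5326, predict=0.5085, se_predict=0.0306)
for name, value in obs9.items():
    print(name, value)
```

Because the inputs are rounded to four decimals, the recomputed values agree with the listing only to about three or four decimals. Observation 1 has STD ERR PREDICT equal to ROOT MSE, so its residual standard error is zero and the studentized residual and Cook's D are undefined there.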
APPENDIX F

SAMPLE  INPUT  /  OUTPUT  FILES  -  SAS  PROC  MATRIX 


*********************** INPUT **********************************

X'XINV         COL1          COL2          COL3          COL4
ROW1         5.9325      -1.91717      0.301021      0.114193
ROW2       -1.91717       1.37198     -0.258016    -0.0292765
ROW3       0.301021     -0.258016     0.0509633    0.00705693
ROW4       0.114193    -0.0292765    0.00705693       0.10918
ROW5       -1.67589     0.0739555    -0.0205588     -0.407743
ROW6       -16.3181       2.35476     -0.219774     -0.585478
ROW7     -0.0252033    0.00429438   -.000129725    0.00624465

X'XINV         COL5          COL6          COL7
ROW1       -1.67589      -16.3181    -0.0252033
ROW2      0.0739555       2.35476    0.00429438
ROW3     -0.0205588     -0.219774   -.000129725
ROW4      -0.407743     -0.585478    0.00624465
ROW5          7.891      0.288322     -0.115236
ROW6       0.288322       63.4104      0.155382
ROW7      -0.115236      0.155382     0.0061736

XO             COL1
ROW1              1
ROW2              2
ROW3              4
ROW4              0
ROW5            0.4
ROW6           0.25
ROW7              3

B1             COL1          COL2          COL3          COL4
ROW1      -0.104345      0.1344P0    -0.03174nu     0.0491367

B1             COL5          COL6          COL7
ROW1        0.32371      0.33820i     0.6116/23

EMS            COL1
ROW1     0.00610136

TCRIT          COL1
ROW1          1.771

************ OUTPUT 1 - SRB LEVEL = 0 ***********************

ROW1           COL1
YO         0.425821
VAR(YO)   0.0050044
CI(LOW)    0.300537
CI(HIGH)   0.551105

************ OUTPUT 2 - SRB LEVEL = 1 ***********************

ROW1           COL1
YO         0.474958
VAR(YO)  0.00314623
CI(LOW)     0.37562
CI(HIGH)   0.574295

************ OUTPUT 3 - SRB LEVEL = 2 ***********************

ROW1           COL1
YO         0.524094
VAR(YO)  0.00262035
CI(LOW)    0.433438
CI(HIGH)   0.614751

************ OUTPUT 4 - SRB LEVEL = 3 ***********************

ROW1           COL1
YO         0.573231
VAR(YO)  0.00342676
CI(LOW)    0.469559
CI(HIGH)   0.676903
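The interval limits in the four outputs follow from YO, VAR(YO) and TCRIT as YO plus or minus TCRIT times the square root of VAR(YO). The Python sketch below reproduces that step. The helper `mean_response_variance` shows one conventional way a variance of this kind is formed, EMS times x0'(X'X)^-1 x0; that form is an assumption on our part, since the extract does not show the PROC MATRIX statements that produced VAR(YO), and all names in the sketch are ours.

```python
import math


def mean_response_variance(x0, xtx_inv, ems):
    """EMS * x0' (X'X)^-1 x0; assumed form, not confirmed by the listing."""
    n = len(x0)
    quad = sum(x0[i] * xtx_inv[i][j] * x0[j]
               for i in range(n) for j in range(n))
    return ems * quad


def interval(y0, var_y0, t_crit):
    """Return (low, high) = y0 -/+ t_crit * sqrt(var_y0)."""
    half = t_crit * math.sqrt(var_y0)
    return y0 - half, y0 + half


T_CRIT = 1.771  # TCRIT from the input listing

# SRB level -> (YO, VAR(YO)) as printed in outputs 1 to 4.
printed = {
    0: (0.425821, 0.0050044),
    1: (0.474958, 0.00314623),
    2: (0.524094, 0.00262035),
    3: (0.573231, 0.00342676),
}

for level, (y0, var_y0) in printed.items():
    low, high = interval(y0, var_y0, T_CRIT)
    print(level, round(low, 6), round(high, 6))
```

Run against the printed values, the limits agree with CI(LOW) and CI(HIGH) in all four outputs to the precision shown, and the predicted retention rate rises by 0.0491367 (the COL4 entry of B1) for each one-level increase in SRB.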
APPENDIX G


EXTRACT  OF  SAS  V5  PROGRAMMING  COMMANDS  USED  IN  THIS 

STUDY 


OPTIONS LINESIZE=64 PAGESIZE=60;

/* Read the quarterly observations for each MOS */
DATA ARRAY1;
   INPUT MOS $ SRB REUP RACE DEP SEX EDUCATE AFQT TERM E5TEST1 E6TEST1
         E5TEST2 E6TEST2 QTR UNEMPLY REAL SEQ YEAR;
   CARDS;

(include data arrays)

PROC PRINT DATA=ARRAY1 N UNIFORM;
   VAR REUP SRB RACE DEP SEX EDUCATE AFQT E5TEST2 E6TEST2
       QTR UNEMPLY REAL;
   BY MOS;

PROC CORR DATA=ARRAY1 NOSIMPLE;
   VAR REUP SRB RACE DEP SEX EDUCATE AFQT E5TEST2 E6TEST2
       QTR UNEMPLY REAL;

PROC PRINT;

/* Stepwise selection; the squared terms SRBSQ and QTRSQ are   */
/* assumed to be created in statements not shown in this extract */
PROC STEPWISE DATA=ARRAY1;
   MODEL REUP = SRB SRBSQ RACE DEP SEX EDUCATE AFQT E5TEST1 E6TEST1
                E5TEST2 E6TEST2 TERM QTR QTRSQ SEQ YEAR UNEMPLY REAL
                / SLE=.150 SLS=.150;
   BY MOS;

/* Fit the overall model and save predicted values and residuals */
PROC REG DATA=ARRAY1;
   MODEL REUP = QTR QTRSQ SRB RACE DEP REAL / I P R CLM CLI INFLUENCE;
   BY MOS;
   OUTPUT OUT=OUT1 P=YHAT1 R=RESID1;

PROC CHART DATA=OUT1;
   HBAR RESID1 / MIDPOINTS=-.25 TO .25 BY .010;

PROC PLOT DATA=OUT1;
   PLOT RESID1*YHAT1='*' RESID1*SEQ='*' / VREF=0;

/* Build lag-1 and lag-4 residual pairs */
DATA OUT11;
   SET OUT1;
   IF SEQ=1 THEN DELETE;
   R11=RESID1;

DATA OUT41;
   SET OUT1;
   IF SEQ<=4 THEN DELETE;
   R41=RESID1;

DATA OUT12;
   SET OUT1;
   IF SEQ=20 THEN DELETE;
   R12=RESID1;

DATA OUT42;
   SET OUT1;
   IF SEQ>=17 THEN DELETE;
   R42=RESID1;

DATA LAG1;
   MERGE OUT11 OUT12;

DATA LAG4;
   MERGE OUT41 OUT42;

PROC PLOT DATA=LAG1;
   PLOT R11*R12='*' / VREF=0 HREF=0;

PROC PLOT DATA=LAG4;
   PLOT R41*R42='*' / VREF=0 HREF=0;
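The last DATA steps above pair each residual with the residual one quarter (LAG1) or four quarters (LAG4) earlier, so that the lag plots can reveal serial correlation. The same construction can be sketched in Python as follows. The function names and the example residual series are hypothetical, and the correlation coefficient is our addition; the original program only plots the pairs.

```python
import math


def lag_pairs(residuals, lag):
    """Pair each residual with the residual `lag` observations earlier."""
    if not 0 < lag < len(residuals):
        raise ValueError("lag must be between 1 and len(residuals) - 1")
    return list(zip(residuals[lag:], residuals[:-lag]))


def lag_correlation(residuals, lag):
    """Pearson correlation between e(t) and e(t - lag)."""
    pairs = lag_pairs(residuals, lag)
    current = [p[0] for p in pairs]
    earlier = [p[1] for p in pairs]
    mean_c = sum(current) / len(pairs)
    mean_e = sum(earlier) / len(pairs)
    cov = sum((c - mean_c) * (e - mean_e) for c, e in pairs)
    var_c = sum((c - mean_c) ** 2 for c in current)
    var_e = sum((e - mean_e) ** 2 for e in earlier)
    return cov / math.sqrt(var_c * var_e)


# Hypothetical residual series, for illustration only.
example = [0.03, -0.02, 0.04, -0.05, 0.01, -0.03, 0.02, -0.01]
print(lag_pairs(example, 1)[:3])
print(round(lag_correlation(example, 1), 3))
```

Dropping the first `lag` residuals from one copy and the last `lag` from the other mirrors the `IF SEQ... THEN DELETE` statements: with 20 observations, lag 1 leaves 19 pairs and lag 4 leaves 16.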